Artificial intelligence has come a long way from its early beginnings, evolving from science fiction to a powerful force shaping industries across the globe. But how did we get here, and what’s next for artificial intelligence (AI) as it continues to transform the way we live and work? To explore this, I had a far-reaching discussion on the Dreamforce stage recently with two of Salesforce’s seminal experts who’ve been involved in shaping AI since its earliest days.
Silvio Savarese is executive vice president and chief scientist at Salesforce, and Peter Schwartz is chief futurist at Salesforce. This conversation has been edited for clarity and to fit this format.
The state of AI: How did we get here?
Asseo: Thank you both for being here to discuss a topic that has defined both of your careers. We’re going to cover three main areas: the history of AI and how we got here, where we are at this moment, and finally, what the future may hold.
Peter, I’m going to start with you. You like to remind people that you are the oldest employee at Salesforce. You’ve seen a lot of innovation cycles in your lifetime. AI itself is not new; it has been around for a long time. Can you give us a little bit of perspective?
Schwartz: In fact, I just turned 78, and I’ve been in this business of trying to think about the future for over 50 years. I started my career at a place called Stanford Research Institute, which was one of the two hubs of AI research in America: there was SRI, near Stanford, and there was MIT. I was an active participant in the conversation around AI beginning around 1972, and the truth is, it failed. It was the beginning of the first “AI winter.” (AI winter refers to a period of reduced funding and interest in AI.)
The original idea around AI was to understand the human brain — how it functions — and then build computer models that simulate it. But the problem with that strategy was that understanding the brain has proved to be much more difficult than anybody really expected. That led to the first AI winter in the late ’70s.
There was then another wave of interest with the birth of parallel computing. A very dear friend of mine, Danny Hillis, pioneered massively parallel computing and had a company in the mid ’80s called Thinking Machines. In fact, he had a great slogan on his door: “I want to build a machine that will be proud of me.” Well, he failed. But he realized that we needed a completely different strategy.
By the 1990s and early 2000s, AI had nothing to do with modeling the brain. There was no cognitive model behind the mathematics of AI, and that opened up new possibilities, because we were no longer constrained by the limits of what we understood about the brain.
Asseo: Silvio, you’ve been at the forefront of AI research for many years. Can you help us make that connection between some of the history that Peter took us through, and how we got to today?
Savarese: I started my Ph.D. in 2000, during one of those AI winters. It might have been the seventh or the eighth. Back then, AI was not a sexy topic. The technology was not mature enough to be brought into production. It was a challenging time. Nevertheless, it was fascinating, which is why I studied it.
At the time, we were using Bayesian networks, which are powerful statistical tools for decision-making. To your point, Peter, these were data-driven models; they were not trying to mimic the brain or be inspired by how it works. But the issue with data-driven models is that they require a lot of feature engineering, meaning you have to do a lot of preprocessing to make the data compatible with the model. The model couldn’t just absorb raw data, and as a result, performance was not great.
Fast-forward to 2010: we started revisiting neural networks. Neural networks had been popular several decades earlier, but all of a sudden they became much more interesting, because they could process data without any feature engineering. They could just consume data as is, with no preprocessing. This made them versatile across many different use cases. It was incredible to see how, all of a sudden, performance was skyrocketing!
Fast-forward another decade: we are now in the era of transformers and attention models. Not only do we not need feature engineering, we don’t need annotations either, which means these models can consume billions of tokens of data. This is why they became so huge, and why we call them large language models. The behavior we observed, though, was unexpected. We saw that, all of a sudden, we could speak with the models in natural language, ask them to reason, and have them generate content: text, videos, images, and even plans. And that’s what makes them so exciting.
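For readers who want to see what “attention” means in practice, here is a minimal sketch of scaled dot-product attention, the core operation inside transformers. The toy matrices and dimensions below are purely illustrative; a real model learns its query, key, and value projections from billions of tokens.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each output row is a weighted mix of the value vectors V,
    where the weights come from how well each query matches each key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))   # three toy "tokens", four dimensions each
print(scaled_dot_product_attention(tokens, tokens, tokens))
```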
Asseo: What is the next step? Where do we go from here?
Savarese: The next step is what we call large action models, which, in our opinion, are the evolution of large language models. These models are trained not only to predict and generate text, but also to understand how to behave, how to take action, and how to use feedback from the environment to improve their performance.
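As a rough illustration of that loop (not Salesforce’s implementation), a large action model can be thought of as repeatedly proposing an action, executing it against the environment, and feeding the result back into the next decision. The propose_action and execute functions below are hypothetical stand-ins.

```python
def propose_action(goal, history):
    # Hypothetical stand-in for a large action model: in practice, a model
    # prompted with the goal and the feedback gathered so far.
    return f"step {len(history) + 1} toward: {goal}"

def execute(action):
    # Hypothetical stand-in for the environment (APIs, websites, business systems).
    return {"action": action, "succeeded": True}

def run_agent(goal, max_steps=3):
    history = []
    for _ in range(max_steps):
        action = propose_action(goal, history)   # decide what to do next
        feedback = execute(action)               # act on the environment
        history.append(feedback)                 # feedback shapes the next step
    return history

print(run_agent("book a flight to San Francisco"))
```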
Schwartz: So from the original idea of large action models to the launch of Agentforce, the time between the original concept and a commercial product is astonishingly short!
The state of AI: What’s happening now?
Asseo: Indeed, and this is thanks to the incredible work by the AI Research team here at Salesforce, which wrote some of the original papers on the transformer that later led to the creation of what we all now know as generative pre-trained transformers, or GPT.
Let’s talk about the present. Over the last couple of years, with all the buzz around generative AI, there’s been a tremendous amount of change. Just last year we were talking about copilots; now we are talking about agents. It’s moving at an astonishing speed, and maybe because of that, there is a lot of uncertainty. Peter, you deal with uncertainties every day as a futurist. What are you hearing from business leaders and customers?
Schwartz: So, there’s a whole bunch of big questions on the minds of decision makers and companies: How quickly will the technology develop? How quickly will applications be successfully deployed? What are the consequences? What regulations will come along to help manage the use of AI? This morning, I was with the head of a big bank. They’re very enthusiastic about it, but they have to be very careful to get approval from regulators before they can do almost anything with AI.
I heard the same thing from the head of a healthcare company yesterday. They have to be very careful in terms of what happens by way of regulation, particularly with patient data. And that’s just one part of it. So, there are various scenarios about how this could play out. Regulations around personally identifiable information (PII) might play out very quickly and smoothly into the future, or it’s entirely plausible that we’ll go through a downturn in AI deployments. As people try stuff, some things work, some don’t. It’s a classic product cycle, and that’s the shakeout in the learning process. There are literally thousands of startups in this industry. Most of them will fail. But some will become the giants of the next generation. There may be an Apple or a Google lurking out there. Salesforce just bought a company in the AI voice space called Tenyx. And soon you’ll be able to talk to your AI in Salesforce.
I think these uncertainties are going to be with us for some time. There’s a really good book called “Co-Intelligence” by Ethan Mollick, which I highly recommend. He recommends doing scenario planning to deal with the many uncertainties around AI.
Asseo: There is a famous saying that goes, “The best way to predict the future is to create it.” There are so many innovations coming out of the AI research group at Salesforce. For example, during the keynote, Marc Benioff talked about the Atlas Reasoning Engine, which was influenced by much of that work. Silvio, can you talk about some of the key technologies coming from our research group that are informing and driving that innovation for Agentforce?
Savarese: Let me step back and describe what we are actually building here with Agentforce. We are creating two types of autonomous systems. One is called an AI assistant, and the other type we call AI agents. AI assistants work closely with humans and help them perform daily tasks such as writing emails, booking appointments, making reservations, and so on. This requires a high level of personalization, so the assistant needs to understand the user’s preferences and align with those preferences to perform a task successfully. In this scenario, the human guides the assistant (a “human in the loop”) as it performs those tasks.
On the other hand, we have AI agents, which are personalized not to just one user, but to a group of users or even an entire organization. They are specialized in specific skills and are assigned to specific roles. They can be introduced on demand by organizations to help scale up and perform complex tasks.
With Agentforce we are, in fact, including both types of agents. What all these different agents and assistants have in common are two things: the memory and the brain.
The memory allows these assistants and agents to remember what happened before, such as conversations, and also to pull in information critical for performing tasks, such as information about products, customers, policies, or best practices. All of this enriches those agents with the right type of knowledge. And in partnership with the engineering teams, we’re also building the next generation of our so-called RAG system, or retrieval augmented generation, which allows us to effectively extract information from repositories and provide it to the agents.
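As a minimal sketch of the retrieval idea (deliberately simplified, and not Salesforce’s RAG system), the pattern is to rank documents in a repository by relevance to the question and hand the top matches to the model as context. The bag-of-words scoring below is a crude stand-in for a real embedding model.

```python
def embed(text):
    # Stand-in for a real embedding model: here, a crude bag-of-words "vector".
    return {word: 1 for word in text.lower().split()}

def similarity(a, b):
    return sum(a.get(word, 0) for word in b)  # count of shared words

def retrieve(query, documents, k=2):
    q = embed(query)
    ranked = sorted(documents, key=lambda d: similarity(embed(d), q), reverse=True)
    return ranked[:k]

documents = [
    "Return policy: customers may return products within 30 days.",
    "Shipping policy: orders ship within 2 business days.",
    "Warranty: hardware is covered for one year.",
]
query = "Can a customer return a product after three weeks?"
context = retrieve(query, documents)
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # in practice, this prompt would be sent to the language model
```

In practice, the retrieved records are appended to the prompt sent to the model, grounding its answer in the organization’s own data.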
The second important component is the brain, which is, speaking more technically, the reasoning engine. (Salesforce’s reasoning engine was incubated within its AI Research team.) It is used to decompose a task into a series of steps to be performed by an agent. This also requires us to be able to gather feedback from the environment.
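A minimal sketch of that decomposition, with hypothetical stand-ins for the planner and the tools it calls, might look like this:

```python
def plan(task):
    # Hypothetical stand-in for a model-based planner that breaks a task into steps.
    return [f"{task}: step {i}" for i in range(1, 4)]

def execute_step(step):
    # Hypothetical stand-in for tool or API calls; returns feedback from the environment.
    return {"step": step, "status": "done"}

def run_task(task):
    feedback = []
    for step in plan(task):
        result = execute_step(step)
        feedback.append(result)    # feedback can trigger re-planning if a step fails
    return feedback

print(run_task("resolve a customer support case"))
```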
Asseo: If, indeed, we’re moving from AI generating text to AI taking action, and we’re enabling more autonomy, it also has the potential to break trust: these agents could do something we don’t want or don’t intend them to do, offend someone, or worse. So how are we approaching trust with this agentic technology?
Savarese: Enabling these agents to be trustworthy is indeed a complex problem. We have to make sure that whatever we are building is safe for customers to use. In a way, we are actually in a very challenging spot. Generative AI for consumers carries far fewer risks. If an agent comes up with the wrong recommendation for a dinner reservation, for example, you may be annoyed, but not much else happens. For businesses, on the other hand, if an agent executes plans in the wrong way, it could have truly devastating consequences.
So we’re building a number of guardrails that allow the agents to operate within safe and trusted boundaries. These guardrails are designed using best practices, and we can inject guardrails and business logic into the way these agents are built.
It is also important to have an iterative process where we can gather feedback from users and understand the areas where the agents can improve. Transparency matters here as well: it ensures that agents disclose to users what they are trying to do at critical decision points, and in situations where there is potential risk that could be detrimental to the business.
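To illustrate the shape of a guardrail (the action types and thresholds below are hypothetical examples, not Salesforce’s business logic), an agent’s proposed action can be checked against rules before it runs, with risky or unknown actions disclosed and escalated to a human:

```python
# Hypothetical business rules: each returns True if the action may run autonomously.
GUARDRAILS = {
    "refund": lambda action: action["amount"] <= 500,   # refunds above $500 need approval
    "delete_record": lambda action: False,              # never allowed autonomously
}

def check_action(action):
    rule = GUARDRAILS.get(action["type"])
    if rule is not None and rule(action):
        return "execute"
    return "escalate_to_human"   # unknown or risky actions are disclosed to a human

print(check_action({"type": "refund", "amount": 200}))      # execute
print(check_action({"type": "refund", "amount": 5000}))     # escalate_to_human
print(check_action({"type": "delete_record", "amount": 0})) # escalate_to_human
```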
Asseo: That sounds very nuanced, and I’m wondering, Peter: what implications does that have for society, in terms of trust and the autonomy that we’re starting to give these agents?
Schwartz: I do think this is a historic shift. We’re in a new world. And the truth is that we don’t really know, as there’s a lot of uncertainty. But what we’re going to see is AI agents everywhere, ubiquitous in many different contexts. They will be taking care of things in the background, so that you don’t have to think about it. And that’s the key operating phrase: You don’t have to think about it.
Now the truth is, as Silvio said, we still need to have people in that loop with the technology, and we will for quite some time. One of the movies that I helped write, “WarGames,” has a moment near the end where Matthew Broderick is in the war room and the Soviet missiles appear to be coming over the pole, and John Wood, the computer scientist, points to the screen and says, “Don’t you realize that’s a computer hallucination?” In a way, we invented computer hallucinations back in 1979 in the film. But it is a hint of what we actually have to do. It was a human being who recognized that this was a game and not reality. And I think that’s the kind of situation we’re going to find ourselves in frequently, where we need human intervention for quite some time, to ensure that what the agents are doing is aligned with human intention.
The state of AI: What’s next?
Asseo: So, you’ve now taken us to the future, Peter, which is our final section. And as we’re moving into this world where these autonomous agents are going to be ubiquitous, where we’re going to all have our personal assistants, let’s explore that a bit. Silvio, you’ve done a lot of work in the area of robotics. How is that going to translate and what is that world going to look like when you combine agents and robotics?
Savarese: In fact, for many years in my career at Stanford, I worked at the Robotics Lab, doing research in robotic perception, which enables robots to see and understand the world. Back then, we were using huge robots with large arms, and we were teaching them to do things like cook an omelet or make espresso. Looking back at those research challenges, we are facing very similar ones today. Building a robot and building an agent are actually very similar problems.
If you think about an agent booking a trip for you and a robot making an omelet, they actually do very similar things. They both need memory: the robot has to remember the ingredients for the omelet and where the cooking tools are, and the digital agent has to remember which websites to access to book a flight reservation.
Similarly, they both need a brain. The robot has to come up with a plan for making an omelet, with its steps and procedures, and the agent booking a trip also has to come up with a sequence of steps that allows it to complete the task successfully.
One important thing which is common in both cases is the environment, which can be adversarial. You might have a situation where the robot may not find the eggs or can’t locate the oil or the pans. What then? They cannot just give up, right? They have to come up with a plan B. Similarly, if an agent is trying to book a trip and can’t find the tickets, or find the right flight, what do you do? You have to come up with an alternate plan. This ability to adapt to an environment which is dynamic or even adversarial is critical.
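A minimal sketch of that plan-B behavior, with a hypothetical environment check standing in for the real world, might look like this:

```python
def try_plan(plan):
    # Hypothetical stand-in for executing a plan against a dynamic environment.
    available = {"book direct flight": False, "book connecting flight": True}
    return available.get(plan, False)

def run_with_fallback(plans):
    for plan in plans:                 # try the primary plan first, then alternatives
        if try_plan(plan):
            return f"succeeded with: {plan}"
    return "all plans failed; ask the user how to proceed"

print(run_with_fallback(["book direct flight", "book connecting flight"]))
```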
What robots and agents do not have in common is that the agent is digital and the robot is physical, which means we have to build the right capabilities, using sensory systems (vision, touch, sound), and that will pave the way for the next generation of agents.
Asseo: What advice would you give business leaders and really, anyone, who is starting to experiment with AI at work?
Schwartz: Get your hands in the dirt. Build an agent. You should start playing with different models. See what works for you, what works in the areas that you deal with. It’s so early that experimentation is highly rewarded.
Savarese: And let’s not forget about trust. Let’s not forget about building these agents in a way that is safe. I want to highlight two important points. First, let’s make sure there is transparency, so that it’s clear what is generated by AI and what is generated by humans. We need that distinction to be clear in our minds and our consumers’ minds. The second point is accountability. In a scenario where one party’s agents try to do something that conflicts with another party’s agents, who will be accountable in case of a disagreement? How will we regulate this type of scenario? This is very important, because it is coming very soon, and it’s a problem we will need to face quickly.