Introduction
AI isn’t just transforming our tools; it’s transforming the way we think about tools, from something we merely use to something more akin to a partner: a technology we delegate tasks to and even collaborate with. Indeed, the notion of autonomous AI systems – i.e., systems that can execute tasks, either in support of or on behalf of humans, with varying degrees of autonomy – is rapidly emerging as a defining design pattern of the era: one or more models (like the Large Action Models I’ve discussed recently) empowered with the ability to leverage external tools and access up-to-date information beyond their training data, reachable across an organization via APIs, all united by a conversational interface that allows them to interact with users through natural language. And the pace is quickening: Gartner predicts that by 2028, interactions with agents will account for a full third of all generative AI use within the enterprise.
But here’s what might be the most important thing about autonomous AI systems: they have a remarkable ability to evolve over time, and in distinct ways. The first applies to what I call AI Assistants, which adapt in unique, individually tailored ways to better understand a single user. The second will be seen in what I call AI Agents, which fit into a team or organization in a more collaborative, open way, using those same powers of personalization to pick up on what matters most to their organizations and peers: practices, processes, tools, and so much more. Simply put, AI Assistants are built to be personalized, while AI Agents are built to be shared (and scaled), and both approaches promise extraordinary opportunities across the enterprise.
The power of learning over time
Both varieties of autonomous AI systems (AI assistants and AI agents) are defined by agency—the ability to act in meaningful ways, sometimes entirely on their own, in the pursuit of a shared organizational goal. We can interact with them just like we do with our human co-workers – by asking questions, issuing commands, and confirming our wishes. They can talk back, too, whether they’re presenting a proposal for approval or simply clarifying a detail. And because they have the potential to get to know us over time, remembering what we’ve done and said in the past, the depth and efficiency of these interactions will increase.
This notion of learning and improving through repetition is a fundamental part of all agents, but crucial differences exist in how they’ll do so. In the case of AI Assistants, learning is an essential part of developing a more responsive, efficient working relationship over time. They’ll identify habits, expectations, and even working rhythms unique to a specific person. Naturally, however, privacy and security are non-negotiable parts of the picture; no matter how powerful, no one wants an assistant they can’t trust.
This trust dimension will be fundamental for AI Agents too. But when it comes to learning, and in contrast to AI Assistants, AI Agents are meant to learn shared practices like tools and team workflows. And far from being private, this is exactly the kind of information that every instance of such an AI Agent — whether an organization deploys 10 or 10,000 — should receive as soon as possible. In other words, as each individual AI Agent improves its performance through learning and field experience, every other agent of that type should make the same gains, immediately.
Lastly, both kinds of autonomous AI systems should be able to learn from external sources — imagine, for instance, an update that spans an entire enterprise, ensuring each deployed agent is instantly up-to-date with a new app, feature, or policy change.
Solving problems at every scale, from the personal to the organizational
AI Agents are all about specialization, and come in two primary forms. The more sophisticated of the two takes over meaningful tasks in domains like marketing, sales, customer service, and IT, and can interact with a team through natural language communication. They can respond to requests via messaging platforms, access shared information and resources, and even participate in conversations—perhaps joining a brainstorming session or kickoff meeting just like a person would. Imagine, for example, the tasks that normally tend to get in the way of more meaningful work: AI Agents can take the baton, ready to work alongside people on individual tasks, group projects, and whatever else their organization may need. These AI Agents can immediately fill gaps where warranted while allowing human workers to onboard faster, emphasize their natural affinity for soft skills, and focus more on the design and creative aspects of a project.
I call the simpler of the two forms “Task Bots.” They’re much more focused than fully realized AI Agents, designed to perform similar workloads but without the emphasis on human communication, making them more like single-task tools. Task Bots are designed for support behind the scenes: AI Agents and AI Assistants alike can instantiate them flexibly, as needed, augmenting their ability to solve problems on a case-by-case basis and making the most efficient use of an organization’s overall resources. Task Bots and full-scale AI Agents may seem conceptually distinct, but in practice they differ only at the implementation level. We interact with them differently, but at their core lies the same capacity for getting work done. Interestingly, research consistently shows that AI performs better when a problem is broken down and distributed across multiple, purpose-built agents, each designed with a different kind of “expertise” in mind; in fact, this form of specialization is a prime example of the case I recently made for smaller, bespoke LLMs that contrast powerfully with the multi-billion-parameter giants that generate so many headlines. As with traditional organizations, it’s an approach that allows smaller problems to be solved more efficiently, and perhaps even in parallel, by a coalition of purpose-built AI agents.
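The coalition-of-specialists idea can be sketched in a few lines of Python. This is a toy illustration, not any real product API: the two “bots” below are trivial stand-ins for small, purpose-built models, and the coordinator simply dispatches subtasks to them in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical purpose-built "Task Bots": each handles one narrow job.
def summarize(text: str) -> str:
    # Stand-in for a small, bespoke summarization model.
    return text[:40] + "..."

def extract_keywords(text: str) -> list:
    # Stand-in for a keyword-extraction specialist.
    return sorted({w.lower().strip(".,") for w in text.split() if len(w) > 6})

def route(task: str, payload: str):
    """A coordinating agent dispatches each subtask to the right specialist."""
    bots = {"summarize": summarize, "keywords": extract_keywords}
    return bots[task](payload)

doc = "Autonomous systems delegate specialized subtasks to purpose-built agents."

# Smaller problems solved in parallel by a coalition of specialists.
with ThreadPoolExecutor() as pool:
    futures = {t: pool.submit(route, t, doc) for t in ("summarize", "keywords")}
    results = {t: f.result() for t, f in futures.items()}
```

Each specialist stays simple enough to build, test, and swap independently, which is precisely the argument for smaller, bespoke models over a single monolith.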
Applications and Real-World Impact
Collectively, these ideas add up to nothing less than a revolution in the way we work. Already, we’re exploring use cases ranging from sales enablement and customer service to full-on IT support, with many more to come. Imagine, for example, a packed schedule of sales meetings, ranging from video calls to in-person trips across the globe, stretching across the busiest month of the season. It’s a hectic reality for sales professionals in just about every industry, but it’s made far more complex by the need to manually curate the growing treasure trove of CRM data generated along the way. But what if an AI assistant, tirelessly tagging along from one meeting to the next, automatically tracked all relevant details and organized them with high precision, with the ability to answer questions about all of it on demand? How much easier would that schedule be? How much more alert and present would the salesperson be, knowing their sole responsibility was to focus on the conversation and the formation of a meaningful relationship?
What’s especially interesting is visualizing how it all works. Imagine your AI assistant as the agent present during each meeting, following the conversation from one moment to the next, and developing an ever-deeper understanding of your needs, behavior, and work habits—with an emphasis, of course, on privacy. As your AI assistant recognizes the need to accomplish specific tasks, however, from retrieving organizational information like CRM records to looking up information on the internet or summarizing meeting notes, it delegates subtasks to AI agents (for higher-level subtasks) or directly to Task Bots (for single, specific subtasks), which can be instantiated as needed to help the assistant achieve an overall goal on behalf of its user. It looks something like this:
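The same delegation hierarchy can be expressed as a minimal sketch. All class and method names here are hypothetical, chosen purely to illustrate the pattern: the assistant routes a single, specific subtask directly to a Task Bot, and hands a higher-level subtask to an AI Agent, which spawns Task Bots of its own.

```python
class TaskBot:
    """Handles exactly one narrow subtask, e.g. a CRM lookup."""
    def __init__(self, name):
        self.name = name

    def run(self, subtask):
        return f"{self.name} completed: {subtask}"

class AIAgent:
    """Handles a higher-level subtask by spawning Task Bots as needed."""
    def run(self, subtask):
        steps = subtask.split(" and ")  # naive decomposition for illustration
        return [TaskBot(f"bot-{i}").run(s) for i, s in enumerate(steps)]

class AIAssistant:
    """Follows the meeting and decides what to delegate, and to whom."""
    def delegate(self, subtask, complex_task=False):
        if complex_task:
            return AIAgent().run(subtask)   # higher-level subtask
        return TaskBot("bot-0").run(subtask)  # single, specific subtask

assistant = AIAssistant()
notes = assistant.delegate("summarize meeting notes")
crm = assistant.delegate("update CRM and draft follow-up", complex_task=True)
```

The point of the sketch is the routing decision: the assistant stays with the user, while agents and bots are instantiated only for as long as a goal requires them.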
At the other end of the spectrum, imagine how many support tickets the average IT desk faces throughout the day at even a small or medium-sized business, let alone a global enterprise. Human attention will always be prized in the case of complex, unusual challenges that require the fullness of our ingenuity. But the vast majority of daily fixes are far less unique. For customers and support staff alike, this is the worst of both worlds: even trivial problems incur the same delays and headaches as serious ones. Once again, AI agents can change all of that. With their human-like abilities to converse and formulate solutions and plans, they can effectively take over large volumes of inbound requests. Behind the scenes, these AI Agents will reach out to Task Bots to turn a request into a solution— e.g., retrieving information, or generating a plain-language summary of the outcome. Again, AI Agents can be instantiated as needed by human managers, scaling up and down naturally with demand, all running at maximum speed 24/7/365. And of course, this promises relief for overworked IT professionals and reduced wait times for customers and organizations as a whole.
The Challenges Ahead
As always, achieving something so impactful won’t be easy, and challenges of a technical, societal, and even ethical nature lie ahead. Chief among them is the challenge of persistence and memory. If we wish, AI Assistants will know us well, from our long-term plans to the daily habits and quirks that make us unique as individuals. We should expect each new interaction to build on a foundation of previous experiences, just as we do with our friends and coworkers. AI Agents, on the other hand, aren’t designed to be personalized in the same way, but should still express a persona of their own — a kind of digital personality configured to match the needs of their role and the team in which they’re embedded. A marketing AI Agent might be quirky and creative, while a sales AI Agent might be all business — hyper-focused and aggressively goal-oriented. And of course, deciding which tasks, actions, and workflows should be assigned to human workers versus their AI Agent colleagues is a central part of this evolving discussion.
But achieving this with current AI models isn’t trivial. Compute and storage costs, latency considerations, and even algorithmic limitations are all complicating factors in our efforts to build autonomous AI systems with rich, robust memory and attention to detail. We also have a lot to learn from ourselves; consider the way we naturally “prune” unnecessary details from what we see and hear, retaining only those details we imagine will be most relevant in the future rather than attempting unreasonable feats of brute-force memorization. Whether it’s a meeting, a classroom lecture, or even a conversation with a friend, humans are remarkably good at compressing minutes, or even hours, of information into a few key takeaways. AI assistants will need similar capabilities. I believe progress is encouraging, and within my own research team here at Salesforce, overcoming these challenges is an exciting, long-term effort.
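As a toy illustration of that pruning idea (all names are hypothetical, and a trivial string truncation stands in for a real summarization model): keep the most recent conversational turns verbatim, and compress everything older into a short running summary of key takeaways.

```python
def summarize(turns):
    # Stand-in for a model-based summarizer: keep a short fragment per turn.
    return "; ".join(t.split(":", 1)[1].strip()[:30] for t in turns)

class PrunedMemory:
    """Keeps recent turns verbatim; compresses older turns into a summary."""
    def __init__(self, window=3):
        self.window = window   # number of recent turns kept word-for-word
        self.summary = ""      # compressed "key takeaways" of older turns
        self.recent = []

    def add(self, turn):
        self.recent.append(turn)
        if len(self.recent) > self.window:
            overflow = self.recent[: -self.window]
            self.recent = self.recent[-self.window:]
            chunk = summarize(overflow)
            self.summary = f"{self.summary}; {chunk}" if self.summary else chunk

    def context(self):
        # What the model would see: the compressed past plus the verbatim present.
        prefix = [f"[summary] {self.summary}"] if self.summary else []
        return prefix + self.recent

mem = PrunedMemory(window=2)
for t in ["user: plan Q3 trip", "ai: booked flights", "user: prefer aisle seats"]:
    mem.add(t)
```

The memory footprint stays bounded no matter how long the conversation runs, which is the practical motivation behind the biological analogy above.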
Next, even more important than the depth of an AI Assistant or AI Agent’s memory is our ability to trust what comes out of it. For all its remarkable power, generative AI is still often beset by questions of reliability, as terms like “hallucination” have entered the public lexicon. At their core, however, hallucinations are hardly a mystery: they tend to occur when a model lacks both the information necessary to answer a question accurately and an awareness of where such gaps in its knowledge exist to begin with. An AI Assistant or AI Agent’s propensity for continued learning will play a role in helping address these gaps, but more must be done along the way. One measure is the burgeoning practice of assigning confidence scores to LLM output, helping users assess the degree to which it can be taken at face value. Additionally, Retrieval-Augmented Generation, or RAG, is one of a growing number of techniques that allow humans to work proactively around an LLM’s lack of information by coupling relevant information with each request — a collection of relevant documents or knowledge base entries, for instance, paired with a request to process or answer questions about their contents — ensuring the model has the resources it needs from the very beginning.
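The RAG pattern can be sketched with a deliberately naive keyword-overlap retriever; a production system would use embedding-based search and an actual LLM call in place of the simple string handling here, and every name below is illustrative only.

```python
def score(query, doc):
    # Crude relevance signal: count of shared words between query and document.
    q = set(query.lower().split())
    return len(q & set(doc.lower().split()))

def retrieve(query, corpus, k=2):
    # Return the k documents with the highest keyword overlap.
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, corpus):
    # Couple the retrieved context with the request, so the model starts
    # with the information it needs instead of guessing.
    context = "\n".join(f"- {d}" for d in retrieve(query, corpus))
    return (
        "Use only the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

kb = [
    "Refunds are processed within 5 business days.",
    "Password resets are handled via the account portal.",
    "Our offices are closed on public holidays.",
]
prompt = build_prompt("How long do refunds take to process?", kb)
```

The resulting prompt grounds the model’s answer in retrieved facts, which is exactly how RAG narrows the knowledge gaps that lead to hallucination.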
I imagine the ethical considerations will be similarly complex. For instance, will the emergence of networks of such autonomous AI systems as well as teams of AI agents, bring with them the need for entirely new protocols and norms? Specifically, how should such agents and teams talk to each other? How should they build consensus, resolve disputes and ambiguities, and develop confidence in a given course of action? How can we calibrate their tolerance for risk or approach to conflicting goals like expenditures of time vs. money? And regardless of what they value, how can we ensure that their decisions are maximally transparent and easily scrutinized in the event of an outcome we don’t like? In short, what does accountability look like in a world of such sophisticated automation?
One thing is for sure—human management should always be the bottom line in how, when, and why digital agents are deployed. I believe digital AI Agents in particular can be a powerful addition to just about any team, but only if the human members of that team are fully aware of their presence and have confidence that the managers they already know and trust are fully in control. Additionally, I believe all interactions with all forms of AI should be clearly marked as such, with no attempt—whether well intentioned or not—to blur the lines between human and machine. I’m strongly against any kind of deception around how such agents are presented to the world. As important as it will be to formalize thoughtful protocols for communication between such agents, protocols for communication between AI and humans will be at least as important, if not more so.
Conclusion
Even after years of AI changing the world again and again — or at least, certain aspects of it — the emergence of reliable, flexible autonomous AI systems may herald the biggest transformation yet in the way we live and work. Much remains to be done, both in terms of technological implementation and the practices and guidance required to ensure their impact is a beneficial and equitable one. But so many of their benefits are already revealing themselves, and I’m reminded every day of how unique and profound this chapter of AI history is proving itself to be.
Alex Varanese contributed to this story.