Skip to Content

5 Ways To Build Trustworthy AI Agents

An AI agent and a human worker hold a flag with an image of a checkmark on it: trustworthy AI
Building trustworthy AI agents starts by designing AI systems that allow humans to partner safely and easily with AI. [Salesforce | Aleona Pollauf]

Here’s how we designed Agentforce to make sure it’s trustworthy, reliable, and transparent.

Imagine a world where AI agents handle many of your everyday tasks, freeing you up to focus on what truly matters — building relationships, making strategic decisions, and innovating. This isn’t a distant dream, though: It’s the reality we’re shaping in partnership with our customers at Salesforce. But the potential of artificial intelligence (AI) agents can only be realized if they’re trusted to act on someone’s behalf. As we enter this new era of agentic AI, it’s important to build AI agents that are not only effective, but trustworthy.

How do we build trustworthy AI agents? We start by designing AI systems that allow humans to partner safely and easily with AI. Our approach is built on intentional design and system-level controls that emphasize and prioritize transparency, accountability, and safeguards.

Imagine a workforce with no limits

Transform the way work gets done across every role, workflow, and industry with autonomous AI agents.

Trust patterns for Agentforce

Trust in AI is still in its infancy. Many customers expect humans to remain involved in nearly all use cases, especially those considered high risk. Our Office of Ethical and Humane Use addresses this by codesigning and codeveloping ethical controls and processes. We partner with Agentforce Product, Engineering, and Design teams, as well as our colleagues in Legal, Security, Privacy, AI Research, and Research & Insights to implement trust patterns that help create trustworthy AI agents.

These patterns — standard guardrails implemented across our AI products — are designed to improve safety, accuracy, and trust while empowering human users. Our latest set of patterns is designed to make sure our agentic AI solutions, including Agentforce, do just that. Here are our top five patterns for building trustworthy AI agents.

1. Reduce hallucinations with topic classification 

A topic is a category of actions related to a particular job to be done by an agent. Topics contain actions, which are the tools available for the job, and instructions, which tell the agent how to make decisions. Collectively, topics define the range of capabilities an agent can handle. An AI service agent might have topics defined that handle order status, warranties, returns, refunds, and exchanges — and anything else gets escalated in a human’s queue. This way, the AI agent doesn’t try to answer a question it shouldn’t, reducing the propensity to hallucinate. Topics can also manage and redirect unwanted inputs and outputs. For instance, an administrator can set up a topic dedicated to prompt injection, which an agent can use if a person asks about proprietary or system information that falls outside of the agent’s intended scope.

2. Limit the frequency of agent-generated emails 

Administrators can cap the frequency of AI outreach to provide better experiences for the recipient. For example, you wouldn’t want an Agentforce Sales Development Representative (ASDR) to contact the same prospect 100 times in a minute. Capping email frequency during setup prevents overreach, reduces the likelihood of email fatigue and opt-outs, and preserves email domain integrity.

3. Respect user privacy with opt-out features 

Agentforce products let customers and prospects opt out of communications. This feature, integrated directly into our customer relationship management (CRM) software, provides a seamless customer experience, allowing humans to control how many communications and emails they receive.

4. Create transparency by design 

To build trust, it’s important that users know when they’re interacting with an AI agent. By default, Agentforce products use standard language to alert administrators and agent managers when they’re about to implement or use AI agents. These notes highlight the capabilities and limitations of AI, ensuring a clear understanding of its impact and potential. Tools like ASDR ship with default disclosures to make sure the recipients of agent-generated emails know they were created and executed by AI. A standard disclosure appears within the first two sentences of a generated email, which can be edited by the employee overseeing the agent. A nonremovable and non-editable disclaimer below the signature line adds extra transparency.

5. Ensure smooth AI-human handoffs

A successful implementation of agentic AI requires seamless transitions between AI agents and the human workers they support. For example, ASDR does this by copying a sales manager on each agent’s email. The same goes for Agentforce Service Agent, which ensures a smooth handoff between the AI service agent and the service rep. This approach fosters an AI-human partnership in every interaction while building trust. In the future, dashboards can provide a more streamlined way to ensure human oversight and AI agent accountability.

These trust patterns are implemented in addition to the Einstein Trust Layer, a foundational element of our AI systems that ensures transparency and control. Within the Trust Layer is our Audit Trail feature, which lets users see what the AI agent did, why it did it, and the outcomes of its actions. This level of transparency is crucial for building trust and making sure that AI operates within ethical boundaries.

User guidance and guardrails

We’ve worked closely with Agentforce teams to build ethics guidance and guardrails into the product’s user interface and supporting documents. This includes comprehensive help and training resources to make sure the person setting up and using the tool understands how to work with AI.

We’ve also developed best practices based on research and testing that help customers get the most out of their agents. Some best practices include:

  • Start with minimal topic instructions. When configuring a new topic, begin with the simplest set of instructions necessary to achieve a basic end-to-end workflow.
  • Be transparent about the use of AI. When sending communications, don’t misrepresent the sender as a person. Add text that clearly identifies the sender as an AI agent and states the autogenerated nature of communications.
  • Adhere to working hours. Make sure messages are sent during times when recipients are most likely to be available. Be mindful of the recipient’s time zone.

Verify trust with ethical testing

To ensure the reliability of our AI agents and relevant safeguards, we conduct rigorous testing and red teaming. This includes adversarial testing. Before launching Agentforce, we subjected our AI agents to over 8,000 adversarial inputs to pressure-test their boundaries. We also involved employees representing diverse perspectives, backgrounds, and lived experiences in trust testing to make sure our AI systems meet the highest standards of reliability and trustworthiness. 

Agentforce: Built on trustworthy AI

Building trust in AI is a journey that requires careful design, rigorous testing, and ongoing innovation. As we look to the future, we’re excited to continue evolving our trust patterns and AI capabilities to include guardrails on topic constraints, auto-escalation to humans, monitoring functionalities, and enhanced Trust Layer innovations. By focusing on intentional design, system-level controls, and implementing trust patterns, we’re paving the way for a future where humans and AI can work together seamlessly and effectively. As we continue to evolve our AI capabilities, we remain dedicated to guiding our customers toward a more autonomous future with thoughtful controls and processes.

Bringing trust to life through AI ethics

The Office of Ethical and Humane Use guides the responsible development and deployment of AI, both internally and with our customers. 

Get the latest articles in your inbox.