What is one of the most important actions you can take to successfully implement agentic AI? If you said ‘managing data,’ give yourself a pat on the back.
Need proof? Consider the airline forced to reimburse a traveler after a bot gave them incorrect information about its bereavement policy. Or the bot that told New York entrepreneurs it was okay to break certain laws. Or the AI that insisted a star NBA player went on a spree vandalizing homes.
What went wrong in those examples? It could be poor oversight or a lack of proper guardrails. Or, more likely, something went wrong with the data. Bad or mislabeled data, data from unknown sources, and data that’s trapped or scattered across multiple systems are common culprits that can doom agentic AI.
Recent research highlights this reality. Nearly 6 in 10 AI users say it’s difficult to get what they want out of AI right now, and over half say they don’t trust the data used to train today’s AI systems. This lack of trust in data could put a serious crimp in AI adoption and hinder your ability to compete.
AI agents promise to completely reshape how work gets done and how customer relationships are managed, but they require data that is accurate, up to date, accessible, and complete. This approach is called data-centric AI, and it rests on the principle that AI systems should be built only on quality data.
What your company can do now
Here are two key steps your business can take now to get your data house in order:
- Connect your data sources (marketing, sales, service, commerce) into a single record, updated in real time, so an AI agent can execute tasks accurately and with the full breadth of data at its disposal.
- Ensure the quality of your data. Remove duplicates, outliers, errors, and other things that can negatively affect AI outputs.
Let’s break down each.
Connect all your enterprise data
Real-time access to quality data, across the organization, is the bedrock of successful AI. But many companies struggle. Data lives in silos, in different formats, all over the place. In fact, businesses say a lack of data harmonization is the second biggest barrier to extracting value from their data.
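To make the idea of harmonization concrete, here is a minimal Python sketch of merging siloed records into one profile per customer. The source names, field names, and the choice of email as the shared key are all assumptions made for illustration; this is not how any particular platform does it.

```python
from collections import defaultdict

# Hypothetical exports from three siloed systems; field names are illustrative.
marketing = [{"email": "Ana@Example.com", "campaign": "spring-sale"}]
sales = [{"email": "ana@example.com", "last_order": "2024-05-01"}]
service = [{"email": "ana@example.com ", "open_tickets": 2}]


def unify(sources):
    """Merge records from several sources into one profile per customer."""
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            # Normalize the join key so formatting differences
            # (case, stray spaces) don't split one customer into several.
            key = record["email"].strip().lower()
            for field, value in record.items():
                if field != "email":
                    # Prefix each field with its source so provenance stays visible.
                    profiles[key][f"{source_name}.{field}"] = value
    return dict(profiles)


unified = unify({"marketing": marketing, "sales": sales, "service": service})
print(unified)
```

Even in this toy case, the three systems spell the same email address three different ways, which is the kind of mismatch that keeps data trapped in silos.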
Platforms like Salesforce Data Cloud connect all data sources, both structured and unstructured, regardless of data type, into one unified, integrated platform. Recent additions to the platform include support for native processing of audio and video, including webinars and calls, and a semantic data model that helps AI agents and humans interpret and use data consistently.
All of this data grounds Agentforce, Salesforce’s suite of AI agent tools, and makes agents contextually aware and knowledgeable. For example, an Agentforce Service Agent has access to past emails, support tickets, voicemails, and any other sources defined by the organization, to better understand the customer’s needs. This data guides the AI agent toward next-best steps, like automating a follow-up email or sharing a chat summary with a human rep.
Lay the groundwork for data-centric AI agents
Your data doesn’t need to be perfect to build an effective AI agent program, but it needs to be clean. That means free of errors, incorrect formats, duplicates, or mislabelings.
The data experts at Tableau offer a comprehensive template to systematically clean your data, a crucial first step in unifying datasets for AI projects.
Remove duplicate or irrelevant observations
Duplication happens when you combine data sets from multiple places, and copies are created. Irrelevant observations happen when data (say, on elderly consumers) doesn’t fit into a problem you’re trying to analyze (say, millennial shopping habits). Removing these makes analysis more efficient, useful, and accurate for an AI system.
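As a rough illustration, the Python sketch below drops exact duplicate rows and then filters out observations that fall outside the question being asked. The sample rows and the birth-year range used for "millennial" are assumptions; substitute your own definition.

```python
# Hypothetical rows combined from two exports; the second row is an exact copy.
rows = [
    {"customer_id": 1, "birth_year": 1990, "basket": 42.0},
    {"customer_id": 1, "birth_year": 1990, "basket": 42.0},
    {"customer_id": 2, "birth_year": 1948, "basket": 18.5},
    {"customer_id": 3, "birth_year": 1985, "basket": 77.0},
]


def drop_duplicates(records):
    """Keep the first occurrence of each identical record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        # A sorted tuple of items is hashable and ignores key order.
        fingerprint = tuple(sorted(record.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


def is_millennial(record):
    # Assumed cohort boundaries (1981 to 1996); adjust to your own definition.
    return 1981 <= record["birth_year"] <= 1996


deduped = drop_duplicates(rows)
relevant = [r for r in deduped if is_millennial(r)]
print(len(rows), len(deduped), len(relevant))  # 4 3 2
```

Real duplicates are often near-matches rather than exact copies, so in practice the fingerprint usually needs to be built from normalized key fields instead of the whole row.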
Fix structural errors
Structural errors occur when data includes typos, incorrect capitalization, or mislabelings. For example, “N/A” and “not applicable” mean the same thing, but aren’t analyzed the same way because they’re rendered differently. The entries should be consistent to ensure accurate and complete analysis by the AI system.
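One simple way to handle this, sketched in Python below, is to map every known spelling of "no value" to a single representation and to standardize case and whitespace on everything else. The list of spellings is an assumption and would need to be extended for real data.

```python
# Assumed set of spellings that all mean "no value"; extend it for your data.
MISSING_LABELS = {"n/a", "na", "not applicable", "none", ""}


def normalize_label(value):
    """Return a consistent form for a free-text category value."""
    if value is None:
        return None
    # Trim, collapse internal whitespace, and lowercase.
    cleaned = " ".join(value.split()).lower()
    if cleaned in MISSING_LABELS:
        return None
    return cleaned


raw = ["N/A", "not applicable", " Enterprise ", "enterprise", "ENTERPRISE"]
normalized = [normalize_label(v) for v in raw]
print(normalized)  # [None, None, 'enterprise', 'enterprise', 'enterprise']
```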
Filter unwanted outliers
There are often one-off observations that don’t align with the data you’re analyzing. That might be the result of incorrect data entry (and should be removed), but sometimes the outlier helps prove a theory you’re working on. In any case, analysis is needed to determine its validity.
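Because an outlier may be either an error or a real signal, a reasonable first step is to flag candidates for review rather than delete them. The Python sketch below uses the common interquartile-range rule with a multiplier of 1.5; both the rule and the sample values are assumptions chosen for illustration, not a method the article prescribes.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] for manual review."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical order totals; the last entry looks like a data-entry slip.
order_totals = [52, 48, 55, 61, 47, 50, 58, 5000]
print(flag_outliers(order_totals))  # [5000]
```

Whether 5000 is a typo or a genuine bulk order is a question for someone who knows the business, which is why the function only reports the value.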
Handle missing data
Missing or incomplete data is a very common problem in data sets, and can reduce the accuracy of AI models. There are a few ways to deal with this:
- Eliminate observations that include missing values; however, this will result in lost information.
- Impute (fill in) missing values based on other observations; however, you may lose data integrity because you’re operating from assumptions and not actual observations.
- Change the way the data is used to effectively navigate the missing values.
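The first two options can be sketched in a few lines of Python. The records are made up, and filling with the median is only one of many possible strategies. The added flag column is an assumption that supports the third option: downstream steps can see which values were estimated and treat them differently.

```python
import statistics

# Hypothetical records; customer "b" has no recorded age.
records = [
    {"customer": "a", "age": 34},
    {"customer": "b", "age": None},
    {"customer": "c", "age": 41},
    {"customer": "d", "age": 29},
]


def drop_missing(rows, field):
    """Option 1: discard rows where the field is missing (loses information)."""
    return [r for r in rows if r[field] is not None]


def impute_median(rows, field):
    """Option 2: fill gaps with the median of observed values, and mark them."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = statistics.median(observed)
    return [
        {**r, field: fill, f"{field}_imputed": True}
        if r[field] is None
        else {**r, f"{field}_imputed": False}
        for r in rows
    ]


dropped = drop_missing(records, "age")
imputed = impute_median(records, "age")
print(len(dropped))  # 3
print(imputed[1])    # {'customer': 'b', 'age': 34, 'age_imputed': True}
```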
Validate
After cleaning the data, you should be able to answer these questions:
- Does the data make sense?
- Does the data follow the appropriate rules for its field?
- Does it prove or disprove your theory, or surface any insight?
- Can you find trends that help inform the next theory? If not, is that because of continued data quality issues?
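The second question, whether each value follows the rules for its field, is the easiest to automate. Below is a minimal Python sketch with a small rule table; the fields, the deliberately loose email pattern, and the age bounds are all assumptions for illustration.

```python
import re

# Hypothetical per-field rules; each returns True when the value is acceptable.
RULES = {
    "email": lambda v: isinstance(v, str)
    and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "age": lambda v: isinstance(v, int) and 0 <= v <= 120,
    "country": lambda v: isinstance(v, str) and len(v) == 2 and v.isupper(),
}


def validate(rows):
    """Return (row_index, field) pairs for every value that breaks its rule."""
    problems = []
    for index, row in enumerate(rows):
        for field, rule in RULES.items():
            if not rule(row.get(field)):
                problems.append((index, field))
    return problems


cleaned = [
    {"email": "ana@example.com", "age": 34, "country": "US"},
    {"email": "not-an-email", "age": 212, "country": "US"},
]
print(validate(cleaned))  # [(1, 'email'), (1, 'age')]
```

The other questions, such as whether the data makes sense or supports your theory, still call for human judgment.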
In the era of AI, data is precious
Soon, data won’t just support operations; it will be the backbone of systems, workflows, customer interactions, and automated processes. It’ll be woven into everything, powering decisions and triggering actions in real time, with human oversight ensuring accountability. The companies that thrive won’t just collect data; they’ll integrate it with technologies so they can take advantage of new capabilities and opportunities.
This represents differentiation on a massive scale.
As Rahul Auradkar, EVP and GM for Data Cloud at Salesforce, said recently, “In this new era of AI and agents, customer data and metadata are the new gold for the enterprise.”
Data is so crucial to the success of agentic AI that it may be even more precious than gold.