Open Source and the Future of Enterprise AI

In the kind of production environment our customers operate in, models are just the start.

Silvio Savarese

Caiming Xiong

September 25, 2023 9 min read


Introduction

Open source has become one of the hottest topics in AI, and the fanfare is well deserved. The open source community is keeping a nimble pace with the state of the art, delivering ever-growing and ever-more-capable models that often compete impressively with their commercial counterparts. It’s an exciting thing to watch. Speaking as researchers who want AI to do as much good as possible, we see many upsides to this; AI has never been more accessible, both in terms of cost and technical feasibility, and we hope this trend continues. For the countless businesses that aren’t quite ready to build models from scratch but need a level of control, privacy, and reliability third-party APIs can’t provide, open source is an increasingly attractive option.

Still, open source models aren’t a replacement for a technical partner like Salesforce, one that understands not just AI but enterprise AI. For starters, open-source models are still insufficient for a wide range of problems, including those that demand bleeding-edge scale and performance or depend on sophisticated multi-modality (the blending of data across varying media, like text and images). But even when an open source option is perfectly suited to one’s needs, there are a number of reasons why it’s only a piece of the puzzle. In the kind of production environment our customers operate in, models are just the start, even the biggest and best of them, and it takes much more to create the kind of reliable, end-to-end tool those environments demand. That’s why, although we’re proud of our technology, we offer much more than that.

Four Challenges for Open Source AI in the Enterprise

The reality of AI is complex, and having a model in hand is just the beginning of an integration process that connects the power of AI to the people, organizations, and processes that need it. At Salesforce, this is one of our founding missions as a company. For decades, we’ve turned the bleeding edge of data, applications, cloud services, and AI into customer-oriented products and services. In fact, many of our customers have no particular AI experience at all, despite realizing the essential role it’s already playing in their world. Allowing them to simply tap into its power through the same APIs and interfaces they use for everything else makes all the difference.

Enterprise AI Challenge #1: Trust and Safety

We believe that any conversation about AI should start by discussing trust and safety, and this is especially salient when considering open-source options. For all its obvious power, what’s less obvious is how exactly one turns generative AI into a technology so reliable that the world’s biggest brands can deploy it safely at a global scale—a hard enough challenge when building something entirely in-house. Biased, toxic, or simply unreliable content is a liability at any scale, whether it’s a model creating a novel image for marketing purposes or an LLM composing a textual analysis of customer data.

At the same time, copyright is emerging as one of the biggest roadblocks to reaching the full potential of generative AI, with pressure mounting worldwide to ensure models produce content that doesn’t violate the ownership of writers, artists, and developers. This is a worthy goal on ethical grounds alone, of course. But for professional applications of all kinds, from small businesses to global leaders, even a single copyright violation is one too many. This alone may disqualify pre-trained open-source models for many enterprises, as a full accounting of the training data used to arrive at the finished product is impractical. Besides, once the influence of unwanted data has been dispersed across hundreds of millions of weights, it’s all but impossible to counteract.

This is about as complex as challenges get, and exactly what a research org like ours confronts every day. We’ve invested heavily, for instance, in a growing toolset for detecting and counteracting bias, toxicity, violations of content ownership, and the like. But even our abilities as researchers are only one piece of the puzzle. Another is the decades of hands-on experience our customers have shared with us since our founding, allowing us to make measurable progress on AI safety not only in theory but in industry-specific practice. Simply put, we dedicate as much time to ensuring our models are safe as we do to building them.

Enterprise AI Challenge #2: Infrastructure

Infrastructure is among the most challenging aspects of any AI deployment, and data is an especially hard problem in the enterprise.

Security and Compliance

Most urgently, any deployment must be built on a foundation of security that turns infrastructure into a continuously trusted resource. This is a challenge of its own, of course, but generative AI complicates it considerably. Apart from its primary functionality, any generative AI model must integrate reliably and effectively with an organization’s security and compliance regime, respecting access controls and data privacy as rigorously as any other deployment.
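As a rough illustration of what respecting access controls can mean in practice, the sketch below filters records by the requesting user's roles before anything is handed to a model. The `Record` type, its `allowed_roles` field, and the sample roles are hypothetical, invented for this example; a real deployment would defer to the organization's existing permission system rather than a hand-rolled check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A piece of enterprise data plus the roles allowed to read it (hypothetical schema)."""
    text: str
    allowed_roles: frozenset


def visible_records(records, user_roles):
    """Keep only records that share at least one role with the user."""
    roles = frozenset(user_roles)
    return [r for r in records if r.allowed_roles & roles]


def build_context(records, user_roles):
    """Assemble model context from permitted records only."""
    return "\n".join(r.text for r in visible_records(records, user_roles))


# Invented sample data for illustration.
RECORDS = [
    Record("Q3 pipeline summary", frozenset({"sales", "exec"})),
    Record("Salary bands by level", frozenset({"hr"})),
]

if __name__ == "__main__":
    print(build_context(RECORDS, {"sales"}))
```

The point of filtering before prompt assembly is that a model cannot leak what it never sees; filtering the model's output afterwards is a much weaker guarantee.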

Personalization and Quality

As our standards of personalization and quality rise, generative AI capabilities must be grounded in repositories of harmonized data. Imagine, for instance, the automation of marketing email campaigns; with a stock model, the results will likely be bland and impersonal. But with sufficient volumes of harmonized data guiding it, that same model can embody your brand voice and even embed personalized touches. Moreover, as techniques like RAG (Retrieval Augmented Generation, which combines the traditional power of database lookups with the humanlike flexibility of an LLM) proliferate, integration with vector databases will be essential too.
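To make the RAG idea a little more tangible, here is a minimal sketch of the retrieval half: rank stored records by similarity to a request, then place the best matches in the prompt. It assumes a toy bag-of-words similarity in place of a real embedding model and vector database, and the sample records and function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list, k: int = 2) -> list:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list, k: int = 2) -> str:
    """Ground the request in retrieved context before it reaches an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Use only this context:\n{context}\n\nRequest: {query}"


# Hypothetical brand and customer records, invented for illustration.
RECORDS = [
    "Brand voice: warm, concise, no exclamation marks.",
    "Customer Dana bought hiking boots in March.",
    "Warehouse inventory is audited every quarter.",
]

if __name__ == "__main__":
    print(build_prompt("Write a warm email to customer Dana about boots", RECORDS))
```

In this toy run the customer record and the brand-voice record are selected and the unrelated warehouse record is left out, which is the behavior that lets a stock model produce a personalized, on-brand email.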

A/B Testing

Finally, administrators will need flexible tools for A/B testing competing models—especially as the number of open source options grows and their variations in performance reliability become more consequential. It’s worth remembering that although often seen as exotic and unique, AI is, at the end of the day, a tool like any other. Optimizing results depends on flexibility, choice, and the ability to quickly and clearly compare options.
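One simple way to compare two candidate models is a two-proportion z-test on how often each produces an acceptable response. The sketch below assumes each response has already been judged a success or failure (for instance, by a user accepting or rejecting it); the function name and the sample counts are illustrative, not drawn from any real evaluation.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in success rates, B minus A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that both models perform the same.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: model A accepted 400 of 1000 times, model B 460 of 1000.
    z, p = two_proportion_z_test(400, 1000, 460, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the gap between the two models is unlikely to be noise, though a real comparison would also weigh latency, cost, and safety metrics alongside acceptance rate.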

These are far from trivial problems—both in terms of technical sophistication and cost—and solving them requires a blend of expertise in AI and the enterprise.

Enterprise AI Challenge #3: Fine-Tuning

Fine-tuning is another roadblock to effectively harnessing AI for companies all over the world. It’s the process by which an off-the-shelf model is optimized for use within a specific industry, domain, or set of problems, and although it doesn’t get the press attention of flashy issues like hardware, big data, and the latest model architectures, it often means the difference between success and failure. The payoff of fine-tuning is often transformative: by properly harnessing feedback signals from users, especially at an enterprise scale, model performance can continue to grow significantly. Over time, this means greater and greater alignment with a company’s culture, values, and data differentiators, yielding a truly customized model that locks in those differentiators and harnesses the company’s own expertise.

But these benefits aren’t as easy to reap as one might assume. Fine-tuning is a skill unto itself (some might even call it a kind of art form). Determining the precise forms and quantities of data it requires, carrying out labor-intensive tasks like data labeling and hyperparameter tuning, and managing the interplay between tuning methods (LoRA, adapters, full-model fine-tuning, or some mixture of these) all depend on experts with hard-won intuitions. Worse, the potential of a properly fine-tuned model is mirrored by pitfalls like overfitting and catastrophic forgetting. Knowing how to avoid such outcomes is equally vital.
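To give a sense of why the choice of tuning method matters, the sketch below compares trainable parameter counts for a single weight matrix under full fine-tuning and under LoRA, which freezes the original matrix and trains two low-rank factors instead. The layer size (4096 by 4096) and rank (8) are illustrative assumptions, not recommendations.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d, r = 4096, 8  # assumed layer width and LoRA rank, for illustration only
    full = full_finetune_params(d, d)
    lora = lora_params(d, d, r)
    print(full, lora, full // lora)  # prints: 16777216 65536 256
```

Under these assumptions LoRA trains 256 times fewer parameters for that layer, which lowers cost but also limits how far the model can move, one reason picking a method and a rank takes experience.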

Enterprise AI Challenge #4: Problem Identification

It’s also worth noting that simply determining exactly where and how AI can be applied is as much a skill as any other. Matching an AI deployment with a real-world problem is not only among the most consequential decisions an organization can make but also among the most unexpectedly complex. For example, can a given model reliably help enterprise users keep up with information overload by summarizing a swarm channel on Slack? What about segment generation, including the production of personalized content for a marketing team? These are just two of countless applications we’re seeing more and more of our customers embrace, many of whom didn’t realize they were possible until they were shown.

But the decision to deploy generative AI in the enterprise, even for a perfectly suited, clearly defined problem, is just the first of countless decisions on the way toward a successful deployment. Most fundamentally, how exactly should infrastructure be organized and allocated? After all, the accessible nature of open-source models doesn’t negate their need for significant hardware, with even modestly sized models requiring substantial computational power, robust storage solutions, and high-speed connectivity. Additionally, one must explore a wide range of candidate models, varying by both size and architecture, as well as the role of fine-tuning to optimize model performance around the specific needs of its users.

Whether directly or indirectly, all of these questions will play a role in determining the deployment’s ultimate, all-in cost to serve. Large models can often deliver transformative performance but incur steep, often prohibitive, infrastructure costs. In contrast, small models are often significantly more affordable and can approximate the performance of their larger counterparts when applied intelligently to narrower tasks, and even surpass them in some cases by leveraging optimally focused training sets. But getting there can be much more of a technical challenge.

And even this is just scratching the surface. Which architectures are best for a given application and optimization strategy? What about advanced techniques like RAG? These are essential questions, but, yet again, answering them requires a level of expertise that an open-source model can’t provide on its own. In other words, knowing how to apply AI is as essential as knowing where to apply it.
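To put rough numbers on the hardware point above, the sketch below estimates the memory needed just to hold a model's weights at different numeric precisions. It is a back-of-the-envelope lower bound under stated assumptions: it ignores activations, caches, and serving overhead, and the parameter counts are illustrative rather than tied to any specific model.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for params in (7e9, 70e9):  # assumed model sizes, for illustration only
        for precision in ("fp16", "int4"):
            gb = weight_memory_gb(params, precision)
            print(f"{params / 1e9:.0f}B params at {precision}: {gb:.1f} GB")
```

Even under these simplified assumptions, a model ten times larger needs ten times the accelerator memory before it serves a single request, which is why model size feeds so directly into cost to serve.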

The applications of generative AI are nearly boundless, of course, from augmenting customer service (often to a transformative extent) to accelerating internal support teams and helping developers work faster. But so are the deployment considerations that go into making it all work in practice. As a company with years of experience at the forefront of this field, we’re constantly working to help our customers identify exactly where AI can make the most tangible difference for their organizations and their end users, and we’re only seeing this demand grow.

An even larger question: When open source models don’t apply

Finally, there’s the most fundamental challenge to open source of all, which is that such models generally represent a subset of what’s possible with generative AI, rather than its entirety. Even the best examples tend to trail the state of the art by small but meaningful margins (margins that can mean everything at an enterprise scale) and target the most common use cases. For companies that can’t accept less than bleeding-edge performance or simply face a sufficiently unique challenge, there’s no escaping the burden of building a truly bespoke model, from scratch, with full control over every step of the process from training to fine-tuning to deployment, either in-house or by partnering with a team that can, like Salesforce. For businesses like these, open source is simply a non-starter.

Conclusion

Open source AI points to an amazing future, and one we’re all looking forward to. In fact, open source makes up a big part of our own strategy. But to reap the full power of AI, including the non-negotiable ability to trust it, we believe it takes the work of an organization like Salesforce, with our resources, experience, and reputation. That’s why, as powerful as this technology is on its own, it will always be the experience built around it, from one end to the other, that delivers the kind of real value businesses can trust.

Special thanks to Alex Michael for his contributions to the writing of this piece.


Silvio Savarese, Executive Vice President and Chief Scientist, Salesforce AI Research

Silvio Savarese is the Executive Vice President and Chief Scientist of Salesforce AI Research, as well as an Adjunct Faculty of Computer Science at Stanford University, where he served as an Associate Professor with tenure until winter 2021. At Salesforce, he shapes the scientific direction and ...


Caiming Xiong, VP, Salesforce Research
