What is a Data Warehouse? 4 Types + Key Features

A data warehouse is an enterprise system that aggregates structured data from sources like CRM, POS transactions, and marketing into a unified repository.

A data warehouse is an enterprise data management system that collates structured data from multiple sources, such as your customer relationship management software, point-of-sale transactions, and marketing efforts, into a single, unified data repository.

Data warehouses support better business decision-making and facilitate business intelligence (BI) activities by acting as a single source of truth. They also have key analytical capabilities, enabling businesses to discover valuable insights from the data they possess.

In the past, data warehouses were usually hosted on-premises on enormous mainframe computers, but this made it challenging to scale. Today, most data warehouses are hosted in the cloud, providing better scalability and cost-effectiveness.

Data Cloud platform in a ssot dashboard

Say hello to Data Cloud.

Data Cloud, the only data platform native to Salesforce, unlocks and harmonises data from any system — so you can better understand your customers and drive growth.

Data warehouse (OLAP) vs database (OLTP)

A database is a data collection that’s optimised for data storage and processing, and it’s best suited to real-time Online Transaction Processing (OLTP), with a focus on day-to-day operations. It can quickly handle basic requests and transactions, like adding, updating or retrieving real-time data.

By contrast, a data warehouse is designed exclusively for data analysis through Online Analytical Processing. It aggregates, queries, and analyses enormous quantities of historical data, helping businesses uncover insights and make smarter decisions.

To build on this, let’s explore the key differences.

Key features Data warehouse (OLAP) Database (OLTP)
Main purpose Analysing aggregated data from various sources Processing transactional data
Use cases Sales analysis, evaluating marketing performance, financial forecasting Processing orders, managing customer details, updating stock levels
Data formatting Multidimensional data modelling focussing on data from multiple angles Unidimensional data, organising data into simple tables
Data architecture Data read operations, optimised to analyse a large volume of data Data write operations, optimised to update transactional data in real-time
Performance times Usually minutes, depending on the data being analysed In real-time, usually in milliseconds
Scalability High scalability Limited scalability
Useful for Business analysts, data experts, key decision-makers Operational staff, sales reps, end-users

To put this into perspective, imagine you’re running a coffee shop. When a customer orders their mocha through an app, your point-of-sale system processes the purchase, updates your inventory and generates a receipt in real time. This is facilitated by an OLTP database.

However, when it comes to analysing your monthly sales data to uncover trends, such as which drinks sell the best during the holidays, you’ll need OLAP systems (data warehouses) to handle that kind of BI activity.

Data warehouses vs databases vs data lakes

Unlike a database and data warehouse, a data lake ingests data in its original form. It supports the ability to store data without needing it cleaned or transformed before it’s ingested.

This means data lakes can store a wider variety of data, both relational and non-relational, meaning they’re typically used to allow organisations to explore enormous data sets. For instance, they commonly help data scientists train and deploy machine learning (ML) models. They’re also a key player in big data analytics.

Expanding on this, let's take a closer look at the key differences.

Key features Data warehouse Database Data lake
Main purpose Analysing aggregated data Transactional processing of data Exploring big data
Use cases Sales analysis, evaluating marketing performance, financial forecasting Processing orders, managing customer details, updating stock levels Running machine learning (ML models), exploratory data analysis
Data structure Primarily structured and semi-structured, though modern data warehouses can store some unstructured data Structured Structured, semi-structured and unstructured
Performance times Usually minutes Milliseconds (real-time) Minutes to hours depending on data size
Scalability High scalability Limited scalability High scalability
Useful for Business analysts, data experts, key decision makers Operational staff, sales reps, end-users Data scientists, business analysts, artificial intelligence and machine learning teams

To get the best of both worlds, some organisations are now experimenting with data lakehouses. These platforms combine the data lake’s ability to ingest raw data with the data warehouse’s high performance, creating a scalable and highly agile solution.

data culture playbook connecting data and ai

Gartner named Salesforce a Leader in Customer Data Platforms. See why.

What are the benefits of using a data warehouse?

Data warehouses empower organisations to aggregate and analyse vast amounts of data, unlocking a wealth of benefits. Here are a few highlights.

  • Improved data quality: A data warehouse collates disparate data. In the process, it cleans and standardises the information, creating high-quality, reliable data sets that businesses can trust as a single source of truth.
  • Improved decision-making: As data warehouses provide insights on specific trends and themes, organisations can use them to make more informed, evidence-based decisions.
  • More comprehensive insights: Data warehouses integrate with BI tools to create dashboards and analyst reports, helping businesses, experts, and stakeholders visualise their data and act on key findings faster.
  • Flexibility: Data warehouses also have enough agility to hone in on specific themes. If a business wants to prioritise improving inventory management, for instance, a data warehouse can analyse relevant data to uncover highly specific insights.

Lastly, a data warehouse delivers all of these benefits in less time, allowing businesses to remain agile in the face of trends. This also frees up time for data analysts to focus on drawing actionable insights from data rather than wasting time on tedious processing.

The key features of a data warehouse

A modern data warehouse architecture isn’t just a filing cabinet for your data. It includes several essential components that deliver extraordinary value for organisations.

  • Data sources: This is the trusted data that the data integration tools collect and transform. It could come from any number of sources, such as CRMs, SQL databases, point-of-sale systems, IoT devices, and spreadsheets.
  • Data staging area: The staging area is an intermediary space where data is collated, cleansed, and transformed into a structured, consistent format.
  • Data integration and data transformation: Data warehouses also feature data integration tools such as ETL (extract, transform, load) and ELT to ingest, modify, and aggregate data. This process also aligns the data integrated into the warehouse so it is prepared for bulk analysis.
  • Central database: A relational database is the repository of the data warehouse. It stores all of the data and maintains historical records for future use.
  • Data marts: Some data warehouses feature data marts, which are smaller data warehouses that hold specific data subsets related to a particular department or operation. For instance, a data mart could hold all relevant sales data, allowing sales reps to identify and qualify leads.
  • Metadata: The metadata provides information about how the data should be structured and formatted within the data warehouse. This component is crucial for ensuring compliance with data governance regulations.
  • Analytical tools: You’ll also find access tools that enable users to analyse data, extract insights, and visualise the information with dashboards, charts, and graphs.
  • Sandbox: Many data warehouses also feature a sandbox — a separate environment used for testing. It allows data analysts to experiment with data, run queries, and develop models without impacting the main warehouse or affecting other users.

In sophisticated data warehouses, you’ll also find advanced analytical capabilities, often powered by artificial intelligence algorithms. These algorithms can enhance data analysis, identify patterns, and generate insights that inform decision-making.

data cloud professional service consultant guidance image

Get expert Data Cloud guidance from Salesforce Professional Services.

With 1.3K+ certified Data Cloud consultants and 240+ implementations globally, we'll help you realise value quickly. Check out our guide to learn how.

What are the different types of data warehouses?

There are four main types of data warehouses, each suited to different business needs. Each type offers distinct advantages in terms of control, scalability, and management.

1. On-premises data warehouses

Some businesses prefer to build their data warehouse on their own premises. This provides complete control over security and governance, but it’s often more expensive and can offer little in the way of scalability.

2. Cloud data warehouses

A cloud data warehouse operates entirely on the cloud. These warehouses are run and deployed by cloud service providers that manage the service on behalf of the business. This means there’s no requirement for on-premises hardware.

3. Data warehouse appliances

This plug-and-play model involves hardware and software that a business connects to its existing network. It’s a more flexible option than an on-premises data warehouse but not as easy to manage as cloud warehouses.

4. Hybrid data warehouses

This setup is ideal for businesses that require a blend of control over their data while leveraging the scalability and cost benefits of the cloud, such as enterprises with strict security or compliance needs.

How do data warehouses function in the cloud?

Traditional data warehouses were typically hosted on-premises. But these days, on-site architectures are rarely elastic enough to support modern business needs. The global big data analytics market is predicted to reach over US$924 billionOpens in a new window by 2032. On-premise solutions are simply too difficult to scale up as technologies advance and business priorities shift.

As such, many businesses now opt for data warehouses with cloud architectures. These warehouses offer the same characteristics as on-premises solutions but with the added advantage of being fully cloud-native with cloud storage. They’re flexible, secure, and can be scaled up or down depending on a business’s needs. They’re also typically more cost-effective and predictable when it comes to pricing.

Another core benefit of cloud data warehouses is the ability to integrate with artificial intelligence (AI) and machine learning (ML). For instance, you could connect an AI model to your data warehouse to identify opportunities to improve customer retention or link it to marketing data to generate actionable insights for enhancing campaign performance.

Typically, cloud-based data warehouses are hosted and fully managed by cloud service providers, making it easier for businesses of all sizes to benefit from their data and gain actionable insights.

Best practices for setting up a data warehouse

Building a data warehouse requires careful planning and execution. These best practices will help you ensure a smooth and successful setup, whether you’re starting from scratch or optimising an existing system.

  • Know your goal: Define what you’re hoping to achieve with your data analytics, whether that be enhancing decision-making, delivering more personalised customer experiences, or any other variable. Involve your stakeholders across departments at this stage.
  • Assess your data: Analyse your existing data sources to evaluate quality and relevance to your needs. You should particularly examine gaps in your data catalogue to identify priorities for integration.
  • Define security and governance policies: Outline clear policies for data accuracy and integrity. Implement access controls as needed, as well as compliance monitoring to protect sensitive data and stay on the right side of legislation.
  • Plan for expansion: Remember to bake scalability into your data warehouse design. You need to have enough flexibility to adapt as technology advances, consumer needs change, and your business grows.
  • Choose the right technology: Decide between on-premises or cloud warehousing solutions based on your needs for scalability and cost efficiency. If you opt for a cloud platform, choose a solution with strong analytics capabilities, AI integrations, and flexibility for upgrades.
  • Train your team: Ensure your team understands the implications and use cases of a data warehouse. If needed, bring together specialists who are familiar with data analysis, BI tools, and ETL solutions.

Summing up

Companies today collect more data than ever before, but most of that data is trapped in disconnected siloes. A cloud data warehouse gives you the power to aggregate and analyse the data you possess, uncover insights and make more informed business decisions.

Salesforce Data Cloud unifies siloed business data into a single platform, providing every team member with a 360-degree view of the customer. This unified data enables real-time analytics, automation, personalised customer experiences, and the use of trusted AI models to drive insights and decision-making. Watch a demo today to learn more.

FAQs

A database scheme is a blueprint that defines how your data warehouse organises and combines data. They indicate how data is stored in tables and how those tables relate to each other.

Star schemas consist of one central fact table that contains quantitative data, such as sales or revenue. This table is surrounded by dimension tables that have descriptive attributes. Star schemas are the simplest, most common type of schema in comparison to the snowflake schema, which is comparatively more complex and less widely used.

Extract Load Transform (ELT) is a data integration process used to prepare data for analysis. Unlike Extract, Transform, Load (ETL), ELT loads the data first before transforming it, leveraging the warehouse’s processing power to do so. ELT is better suited to cloud systems, whereas ETL is more common with legacy systems, where data needs to be transformed before it can efficiently be loaded.

A data mart is much smaller and more focused than a data warehouse. It’s essentially a specialised repository tailored to a specific department, like sales or marketing. It segregates important data for a particular team so they can access it and get targeted insights more easily than if they had to navigate the broader warehouse.