Skip to Content
GuidesWarehouse-First CDP

Building a Warehouse-First CDP

This guide walks you through building a complete Customer Data Platform using Zeotap with your data warehouse as the single source of truth. By the end, you’ll have a full pipeline from raw warehouse data to activated audiences in your marketing and sales tools.

Philosophy: Your Warehouse Is the Source of Truth

In a warehouse-first CDP, your data warehouse (Snowflake, BigQuery, or Databricks) is the central hub for all customer data. Instead of copying data into a proprietary CDP, Zeotap generates and executes SQL directly against your warehouse. This means:

  • No data copies — Customer data stays where it is. Computed attributes, audiences, and models are all computed as SQL queries in your warehouse.
  • SQL-native — Every computation is transparent SQL that your data team can review, version, and audit.
  • Full warehouse power — Leverage your warehouse’s compute, scale, and query engine rather than a black-box CDP engine.
  • Complete audit trail — Every query, every sync run, every audience evaluation is logged with timestamps and row counts.
  • Single source of truth — No reconciliation between “CDP data” and “warehouse data” because they’re the same thing.

Prerequisites

  • A Zeotap workspace (create one here)
  • A data warehouse (Snowflake, BigQuery, or Databricks) with customer data tables
  • Admin or Editor role in your Zeotap workspace

Step 1: Connect Your Warehouse

Warehouses give Zeotap read access to your data. Zeotap never writes to your source tables (except for materialized computed-attribute results in a dedicated schema).

  1. Navigate to Warehouses in the left sidebar
  2. Click Add Warehouse
  3. Select your warehouse type
  4. Enter your connection credentials:
    • Snowflake: Account, Warehouse, Database, Schema, Username, Password or Key Pair
    • BigQuery: Project ID, Dataset, Service Account JSON
    • Databricks: Host, HTTP Path, Personal Access Token, Catalog, Schema
  5. Click Test Connection — Zeotap will execute a lightweight SELECT 1 query to verify connectivity
  6. Click Save

Best practice: Create a dedicated read-only service account for Zeotap. Grant SELECT access only to the schemas and tables that contain customer data. Follow the principle of least privilege.

For detailed setup instructions per warehouse, see:

Step 2: Build Data Models for Customer Entities

Models are SQL queries that shape your raw warehouse tables into the data structures Zeotap works with. Think of them as views that define “this is what a customer looks like” or “these are all the orders.”

  1. Navigate to Models in the left sidebar
  2. Click Create Model
  3. Select the warehouse you connected in Step 1
  4. Write a SQL query that selects customer data:
-- Example: Customer master model SELECT c.customer_id, c.email, c.phone, c.first_name, c.last_name, c.created_at, c.country, c.company_name, COALESCE(o.total_orders, 0) AS total_orders, COALESCE(o.total_revenue, 0) AS total_revenue, o.last_order_date FROM customers c LEFT JOIN ( SELECT customer_id, COUNT(*) AS total_orders, SUM(order_total) AS total_revenue, MAX(order_date) AS last_order_date FROM orders GROUP BY customer_id ) o ON c.customer_id = o.customer_id WHERE c.email IS NOT NULL
  1. Click Preview to see sample results and verify the data shape
  2. Configure column types — mark customer_id and email as identifiers, the rest as attributes
  3. Name the model (e.g., “Customer Master”) and click Save

Tip: Create multiple models for different entity types — one for customers, one for accounts, one for products. This keeps each model focused and composable.

Step 3: Define Your Relationships with Entity Types

Relationships give Zeotap a structured understanding of your customer data. Entity types represent the core objects in your business (User, Account, Product, etc.), and relationships define how they connect.

  1. Navigate to Relationships in the left sidebar
  2. Click Create Entity Type
  3. Name it (e.g., “User”) and provide a description
  4. Map attributes from your models to this entity type:
    • Identifiers — Fields that uniquely identify an entity (e.g., customer_id, email, phone). These are used for identity resolution and destination matching.
    • Attributes — Descriptive fields (e.g., first_name, country, company_name)
  5. Click Save

If your data has multiple entity types, create them and define relationships:

Entity relationship examples

Relationships enable cross-entity audience conditions like “Users whose Account has more than 100 employees.”

Step 4: Create Computed Attributes from Warehouse Data

Computed attributes transform raw data into actionable metrics. They’re evaluated as SQL against your warehouse and materialized for fast audience building.

  1. Navigate to Computed Attributes under Sources in the left sidebar
  2. Click Create Computed Attribute
  3. Choose the computed attribute type:

SQL Computed Attribute — For custom calculations:

-- Lifetime value SELECT customer_id, SUM(order_total) - COALESCE(SUM(refund_amount), 0) AS lifetime_value FROM orders LEFT JOIN refunds USING (order_id) GROUP BY customer_id

Aggregation Computed Attribute — For common metrics (no SQL needed):

  • Function: Count
  • Source table: orders
  • Group by: customer_id
  • Result: “Total Order Count”

Formula Computed Attribute — To combine existing computed attributes:

average_order_value = total_revenue / total_orders is_high_value = lifetime_value > 1000 days_since_last_order = DATEDIFF('day', last_order_date, CURRENT_DATE())
  1. Set a schedule (daily is a good starting point for most computed attributes)
  2. Click Save

Recommended starting computed attributes:

Computed Attribute NameTypeDescription
Lifetime ValueSQLTotal spend minus refunds
Total OrdersAggregationCount of all orders
Average Order ValueFormulaLifetime value / total orders
Days Since Last PurchaseSQLDays between last order and today
Engagement ScoreSQLComposite score from login frequency, feature usage, etc.
Account SizeAggregationNumber of users per account

Step 5: Build Audiences

Audiences are groups of customers defined by conditions on computed attributes and attributes. The filter builder lets you construct complex conditions with AND/OR logic.

  1. Navigate to Audiences under Activation in the left sidebar
  2. Click Create Audience
  3. Select the entity type (e.g., “User”)
  4. Build your conditions using the filter builder:

Example: High-Value At-Risk Customers

WHERE lifetime_value > 500 AND days_since_last_purchase > 60 AND total_orders >= 3
  1. Click Estimate to see approximate audience size
  2. Click Preview to see sample members and verify the audience makes sense
  3. Name the audience and click Save

Useful starter audiences:

AudienceConditionsUse Case
High-Value Customerslifetime_value > 500VIP programs, premium support
Churn Riskdays_since_last_purchase > 60 AND total_orders >= 2Re-engagement campaigns
New Users (7 days)created_at > 7 days agoOnboarding sequences
Big Accountsaccount_size > 50Enterprise sales outreach
Engaged but Not Purchasedengagement_score > 80 AND total_orders = 0Conversion campaigns

Step 6: Activate to Destinations

Now connect your audiences to the tools where your team takes action — CRMs, ad platforms, email tools, and more.

Set Up a Destination

  1. Navigate to Connections under Destinations in the left sidebar
  2. Click Add Destination
  3. Select the destination type (e.g., Salesforce, Google Ads, Braze)
  4. Complete the authentication flow
  5. Click Save

Create a Sync

  1. Navigate to Syncs under Destinations
  2. Click Create Sync
  3. Select the audience you built in Step 5
  4. Select the destination
  5. Choose a sync mode:
    • Mirror — Keeps the destination in perfect sync (adds new members, removes exited members). Best for most use cases.
    • Additive — Only adds new members, never removes. Good for building cumulative lists.
    • Subtractive — Only removes members who no longer qualify. Good for suppression lists.
  6. Map fields — select which computed attributes and attributes to send alongside membership
  7. Set a schedule (e.g., every 6 hours)
  8. Click Save & Run

Monitor the Sync

  1. Go to the Sync Runs tab on the sync detail page
  2. Review the run summary: members added, removed, updated, and any errors
  3. Check Insights > Sync Health for ongoing monitoring across all syncs

Benefits of the Warehouse-First Approach

Data Stays in Your Warehouse

Customer data never leaves your infrastructure. Zeotap reads data via SQL queries and writes only activation payloads (membership lists, mapped fields) to destinations. Your warehouse remains the authoritative store.

SQL-Native Transparency

Every computed attribute, audience, and model is expressed as SQL. Your data team can review the exact queries being run, version them in your own systems, and extend them with warehouse-specific functions.

Complete Audit Trail

Every sync run, computed attribute evaluation, and audience computation is logged with timestamps, row counts, and error details. Use Insights dashboards or query the audit tables directly in your warehouse.

No Vendor Lock-In

Because your data stays in your warehouse and all logic is expressed as SQL, there’s no proprietary data format or opaque scoring engine to migrate away from. If you stop using Zeotap, your data is exactly where it always was.

Warehouse-Scale Performance

Audience evaluation, computed-attribute computation, and data preparation all happen on your warehouse’s compute engine. You get the full power of Snowflake, BigQuery, or Databricks — not a scaled-down CDP query engine.

Next Steps

Last updated on