Models

Models are SQL queries that define the shape and content of the data you want to sync from your warehouse to your destinations. They are the core abstraction in Zeotap’s data pipeline — connecting your raw warehouse data to the activation layer.

What Is a Model?

A Model is a saved SQL query paired with column configuration. It runs against a Warehouse (your data warehouse) and produces a result set that can be consumed by Reverse ETL, Relationships mappings, and the Activation layer (Audiences, Computed Attributes, A/B Tests, and more).

Think of models as views into your warehouse data that are specifically designed for activation. Unlike raw tables, models let you:

Join and transform data from multiple tables into a single, clean result set
Filter rows to include only relevant records (e.g., active customers, recent orders)
Rename and reshape columns to match destination requirements
Tag columns with semantic types (identifier, attribute) so Zeotap knows how to use them

Why Models Matter

Models are the bridge between your warehouse and every downstream system. Without them, you would need to:

Write custom integrations for each destination
Manage schema mapping in every sync
Duplicate transformation logic across pipelines

With models, you define the transformation once and reuse it across any number of syncs and destinations.

Model with SQL query and column config feeding multiple syncs

Model Components

Every model consists of three parts:

1. Source Selection

Each model is connected to exactly one Warehouse. The SQL query runs against the warehouse configured in that source. You select the source when creating the model.

2. SQL Query

The heart of the model is a SQL query written in the dialect of your source warehouse (Snowflake SQL, BigQuery SQL, etc.). The query can include:

SELECT statements with any columns, expressions, or aggregations
JOIN across multiple tables (even across schemas, if permitted)
WHERE filters to limit rows
GROUP BY, HAVING, ORDER BY, and LIMIT clauses
CTEs (WITH clauses) for complex transformations
Window functions for ranking, running totals, etc.

See the SQL Editor guide for details on the editing experience.

3. Column Configuration

After writing and previewing your SQL, you configure each output column:

Column type — Is it an identifier (used to match records) or an attribute (data to sync)?
Primary key — Which column(s) uniquely identify each row?
Data type — What type of data does the column contain?

See Column Configuration for the full guide.

Model Lifecycle

1. Create

Write a SQL query, preview the results, and configure columns. The model is saved but not yet actively syncing data.

2. Use in Reverse ETL

Connect the model to one or more Reverse ETL syncs. Each sync maps the model’s columns to a destination’s fields and runs on a schedule.

3. Use in Relationships

Map the model’s columns to entity types and relationships in the Relationships module to build a unified data model.

4. Edit

Update the SQL query or column configuration as your data needs evolve. Changes take effect on the next sync run.

5. Archive

When a model is no longer needed, archive or delete it. Zeotap prevents deletion if active syncs depend on the model.

API Reference

Models are managed through the Zeotap REST API:


# List all models
GET /api/v1/models
 
# Get a single model
GET /api/v1/models/{id}
 
# Create a model
POST /api/v1/models
 
# Update a model
PUT /api/v1/models/{id}
 
# Delete a model
DELETE /api/v1/models/{id}
 
# Preview model results
POST /api/v1/models/{id}/preview

Example: Create a Model


curl -X POST https://composable.zeotap.com/api/v1/models \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Active Customers",
    "source_id": "src_abc123",
    "query": "SELECT id, email, name, lifetime_value FROM customers WHERE status = '\''active'\''",
    "columns": [
      {"name": "id", "type": "identifier", "primary_key": true},
      {"name": "email", "type": "identifier"},
      {"name": "name", "type": "attribute"},
      {"name": "lifetime_value", "type": "attribute"}
    ]
  }'

Best Practices

Keep queries focused — Each model should serve a specific purpose (e.g., “Active Customers for CRM Sync” rather than “All Customer Data”)
Use CTEs for readability — Break complex queries into named steps with WITH clauses
Always include a primary key — Every model needs at least one column marked as the primary key for syncs to function correctly
Preview before saving — Use the preview feature to verify your query returns the expected data
Name models descriptively — Use names that describe both the data and the intended use case
Avoid SELECT * — Explicitly list the columns you need to avoid syncing unnecessary data and to maintain stability when the source table schema changes
Filter early — Apply WHERE clauses to reduce the data scanned and improve query performance