Elasticsearch

Distributed search and analytics engine. Use Zeotap to bulk-index documents into Elasticsearch indices, keeping your search data in sync with your warehouse.

Prerequisites

An Elasticsearch cluster (version 7.x or later) reachable from Zeotap, preferably over HTTPS
An API key or username/password credentials with write permissions on the target index
The cluster endpoint URL (e.g., https://my-cluster.es.us-east-1.aws.found.io:9243)

Authentication

The Elasticsearch destination supports two authentication methods. Choose the one that matches your cluster configuration.

API Key

Field	Type	Required	Description
API Key	Password	Yes	Base64-encoded API key (the `encoded` value from the Create API Key response)

To create an API key:

Open Kibana and navigate to Stack Management > API Keys
Click Create API key
Assign a name and set the appropriate role permissions (at minimum, write and create_index on the target index)
Copy the encoded value from the response — this is your API key
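
If you only have the raw `id` and `api_key` values from the Create API Key response, the `encoded` value can be rebuilt from them: it is the Base64 encoding of `id:api_key`. The following is a minimal sketch of that relationship and of the `Authorization` header Elasticsearch expects for API keys. The function names and sample values are hypothetical and are not part of Zeotap.

```python
import base64


def encode_api_key(key_id: str, api_key: str) -> str:
    """Return the Base64 form of "id:api_key", matching the `encoded` field."""
    raw = f"{key_id}:{api_key}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def api_key_header(encoded: str) -> dict:
    """Build the Authorization header used for API key authentication."""
    return {"Authorization": f"ApiKey {encoded}"}


# Hypothetical sample values, for illustration only.
encoded = encode_api_key("my-key-id", "my-key-secret")
print(api_key_header(encoded))
```

Paste the resulting encoded string, without the `ApiKey ` prefix, into the API Key field.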

Basic Auth

Field	Type	Required	Description
Username	Text	Yes	Elasticsearch username
Password	Password	Yes	Elasticsearch password

Configuration

Field	Type	Required	Description
Endpoint URL	Text	Yes	The Elasticsearch cluster endpoint URL. Must start with `http://` or `https://`
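
To check an endpoint value before saving it, something like the sketch below could be used. It only illustrates the scheme rule stated in the table. The function name is hypothetical, and the trailing-slash normalization is an assumption, not documented Zeotap behavior.

```python
from urllib.parse import urlparse


def validate_endpoint(url: str) -> str:
    """Return the URL without a trailing slash, or raise ValueError."""
    cleaned = url.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("Endpoint URL must start with http:// or https://")
    if not parsed.netloc:
        raise ValueError("Endpoint URL is missing a host")
    return cleaned.rstrip("/")
```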

Target Settings

Field	Type	Required	Description
Index Name	Text	Yes	The Elasticsearch index to write documents to. If the index does not exist, Elasticsearch auto-creates it on the first write

Supported Operations

Sync Modes

Mode	Supported	Description
Upsert	Yes	Creates new documents or fully replaces existing ones (uses the Elasticsearch `index` action)
Insert	Yes	Creates new documents only; fails if a document with the same ID already exists (uses the `create` action)
Update	Yes	Partially updates existing documents (uses the `update` action with `doc` merge)
Mirror	—	Not supported

Audience Sync Modes

Elasticsearch does not support audience sync modes. It has no list or segment membership API.

Features

Field Mapping: Yes — map source columns to Elasticsearch document fields
Schema Introspection: No — Elasticsearch indices accept dynamic mappings

Required Mapping Fields

There are no strictly required mapping fields. However, mapping a field to _id is strongly recommended for upsert and update modes so that Zeotap can address specific documents.

Default Destination Fields

Field	Type	Description
`_id`	string	Elasticsearch document ID. If mapped, used as the document `_id` for upserts and updates

How It Works

Zeotap writes data to Elasticsearch using the Bulk API:

Rows from the sync batch are converted to NDJSON (newline-delimited JSON) format
Each row becomes a two-line pair: an action/metadata line and a document body line
The action type depends on the sync mode:
- Upsert: {"index": {"_index": "my-index", "_id": "doc-123"}} followed by the full document
- Insert: {"create": {"_index": "my-index", "_id": "doc-123"}} followed by the full document
- Update: {"update": {"_index": "my-index", "_id": "doc-123"}} followed by {"doc": {...}}
Rows are sent in chunks of 500 documents per _bulk request
The Content-Type header is set to application/x-ndjson
Each chunk is sent with automatic retry on transient errors (429 Too Many Requests, 5xx)
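
The steps above can be sketched roughly as follows. This is an illustration of the NDJSON layout and chunking, not Zeotap's actual implementation. The function names are hypothetical, and removing `_id` from the document body before serializing is an assumption.

```python
import json
from typing import Iterator

# Sync mode -> Bulk API action, as listed in the steps above.
ACTIONS = {"upsert": "index", "insert": "create", "update": "update"}


def to_ndjson(rows: list[dict], index: str, mode: str) -> str:
    """Build a Bulk API body: one action line plus one body line per row."""
    action = ACTIONS[mode]
    lines = []
    for row in rows:
        doc = dict(row)
        meta = {"_index": index}
        # Assumption: the mapped _id goes in the metadata, not the document.
        doc_id = doc.pop("_id", None)
        if doc_id is not None:
            meta["_id"] = str(doc_id)
        lines.append(json.dumps({action: meta}))
        # Update actions wrap the partial document in {"doc": ...}.
        lines.append(json.dumps({"doc": doc} if action == "update" else doc))
    return "\n".join(lines) + "\n"  # the Bulk API requires a trailing newline


def chunked(rows: list[dict], size: int = 500) -> Iterator[list[dict]]:
    """Yield successive chunks of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each string returned by `to_ndjson` would be sent as the body of one `POST /_bulk` request with the `application/x-ndjson` content type.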

Response Handling

The Bulk API returns per-item status in its response. Zeotap inspects each item:

2xx status: Document indexed successfully
4xx status: Permanent failure (e.g., mapping conflict, document already exists in insert mode). The row is marked as failed with the Elasticsearch error type and reason.
5xx status: Transient failure. The entire chunk is retried with exponential backoff.
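
As a rough illustration of this per-item inspection, the sketch below groups the entries of a Bulk API response by status class. The response shape (`items`, with one action key per item holding `status` and `error`) follows the Elasticsearch Bulk API. The function name and the result structure are hypothetical.

```python
def classify_items(response: dict) -> dict:
    """Group Bulk API response items into succeeded, failed, and retry."""
    result = {"succeeded": [], "failed": [], "retry": []}
    for position, item in enumerate(response.get("items", [])):
        # Each item has a single key (index, create, or update); take its value.
        detail = next(iter(item.values()))
        status = detail.get("status", 0)
        if 200 <= status < 300:
            result["succeeded"].append(position)
        elif 400 <= status < 500:
            error = detail.get("error", {})
            result["failed"].append(
                (position, error.get("type"), error.get("reason"))
            )
        else:
            # 5xx (or any unexpected status) is treated as transient here.
            result["retry"].append(position)
    return result
```

The positions in the result line up with the order of rows in the request, because the Bulk API returns items in the same order they were submitted.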

Rate Limits

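Elasticsearch does not impose fixed rate limits at the API level. Instead, each node processes bulk requests through a write thread pool with a bounded, configurable queue (the default queue size depends on the Elasticsearch version; it was 200 in older releases). When the queue is full, the cluster rejects new requests with HTTP 429 (Too Many Requests).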

Zeotap handles 429 responses with exponential backoff and automatic retry (up to 3 retries per chunk).
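
A retry loop of this kind might look like the sketch below. Only "exponential backoff" and "up to 3 retries" come from the description above. The base delay, the jitter, and the `TransientError` class are assumptions made for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a 429 or 5xx response."""


def send_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send(); on TransientError, retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the failure
            # Delay doubles on each attempt: base, 2*base, 4*base, plus jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

With these assumed defaults, a chunk that keeps failing is attempted four times in total (one initial attempt plus three retries) before the error is raised.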

Recommended Batch Sizes

Small documents (< 1 KB): 500–2,000 documents per request
Medium documents (1–10 KB): 200–500 documents per request
Large documents (> 10 KB): 50–200 documents per request

Zeotap uses a default chunk size of 500 documents, which works well for typical use cases.
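
To see which range applies to your data, you can estimate the average serialized document size from a sample. The sketch below is one way to do that. The function name is hypothetical, and treating 1 KB as 1,024 bytes is an assumption.

```python
import json


def recommended_chunk_range(docs: list[dict]) -> tuple[int, int]:
    """Map the average serialized document size to the ranges listed above."""
    if not docs:
        raise ValueError("need at least one document to estimate size")
    total = sum(len(json.dumps(d).encode("utf-8")) for d in docs)
    avg_bytes = total / len(docs)
    if avg_bytes < 1024:
        return (500, 2000)   # small documents
    if avg_bytes <= 10 * 1024:
        return (200, 500)    # medium documents
    return (50, 200)         # large documents
```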

Best Practices

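Map _id explicitly for upsert and update modes. Without a mapped ID, Elasticsearch auto-generates one for each new document, so later syncs cannot address that document: upserts create duplicates and updates have nothing to target.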
Create the index with explicit mappings before the first sync. While Elasticsearch auto-creates indices with dynamic mapping, explicit mappings give you control over field types and analyzers.
Use API key authentication in production. API keys can be scoped to specific indices and operations, following the principle of least privilege.
Monitor cluster health during large syncs. The Bulk API can put significant load on the cluster, especially with large documents or high throughput.
Use upsert mode for most use cases. It is the most forgiving — it creates documents that don’t exist and replaces those that do.
Avoid insert mode unless you specifically need uniqueness enforcement. Insert mode fails if the document already exists, which can cause high failure rates on re-syncs.
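
As an illustration of the explicit-mappings recommendation, the sketch below builds a request body for the Elasticsearch Create Index API (`PUT /<index>`). The field names and types are hypothetical placeholders; replace them with the fields you map in Zeotap.

```python
import json

# Hypothetical fields, for illustration only.
index_body = {
    "mappings": {
        "properties": {
            "email": {"type": "keyword"},       # exact-match lookups
            "full_name": {"type": "text"},      # analyzed full-text search
            "lifetime_value": {"type": "double"},
            "signup_date": {"type": "date"},
        }
    }
}

# Send this JSON as the body of: PUT /my-index
print(json.dumps(index_body, indent=2))
```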

Troubleshooting

Authentication failed (401)

Verify that your API key or username and password are correct. For API keys, ensure you are using the encoded value (Base64-encoded), not the raw id or api_key fields on their own. Check that the API key has not been invalidated and has not expired.
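
One quick local check is to confirm that the value you pasted decodes to an `id:api_key` pair. The sketch below does only that; it does not contact the cluster, so it cannot tell whether the key is still valid. The function name is hypothetical.

```python
import base64


def looks_like_encoded_api_key(value: str) -> bool:
    """Return True if value Base64-decodes to a non-empty "id:api_key" pair."""
    try:
        decoded = base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:
        # Covers invalid Base64 (binascii.Error) and non-UTF-8 bytes.
        return False
    key_id, sep, secret = decoded.partition(":")
    return bool(key_id and sep and secret)
```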

Forbidden (403)

The authenticated user or API key lacks the required permissions. Ensure the credentials have write and create_index privileges on the target index. For API keys, check the role descriptors used when creating the key.

Index not found (404)

If using insert or update mode, the index must exist before writing. Create the index manually or switch to upsert mode, which triggers auto-creation.

Mapper parsing exception

A field value does not match the index mapping. For example, sending a string to a field mapped as integer. Check the index mapping and ensure your source data types are compatible. Consider using explicit field mappings in Zeotap to cast or rename fields.
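
If you want to test a sample of rows against the index mapping before syncing, a check along these lines can surface incompatible values early. It is a simplified sketch: the type table covers only three Elasticsearch field types, and the function and variable names are hypothetical.

```python
# Elasticsearch field type -> Python caster (deliberately incomplete).
CASTERS = {"integer": int, "double": float, "keyword": str}


def coerce_row(row: dict, field_types: dict) -> dict:
    """Cast each value to the Python type matching its mapped field type."""
    out = {}
    for field, value in row.items():
        caster = CASTERS.get(field_types.get(field))
        # Leave nulls and fields without a known type untouched.
        out[field] = caster(value) if caster and value is not None else value
    return out
```

A value that cannot be cast, such as `"abc"` for an `integer` field, raises `ValueError` here, which points at the same rows Elasticsearch would reject.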

Version conflict engine exception

This occurs in update mode when there is a concurrent write to the same document. Zeotap retries these automatically. If conflicts persist, check for other processes writing to the same index.

Circuit breaker exception

The cluster is running low on memory. Reduce the sync batch size, add more nodes to the cluster, or increase the JVM heap size. This is a cluster-capacity issue, not a Zeotap issue.

Connection timeout

The Elasticsearch cluster is unreachable or slow to respond. Verify the endpoint URL, check network connectivity, and ensure the cluster is healthy. For cloud-hosted clusters (Elastic Cloud, Amazon OpenSearch Service), verify the cluster has not been paused or terminated.

Too many requests (429)

The cluster’s bulk queue is full. Zeotap retries with exponential backoff automatically. If 429 errors persist, consider reducing the sync frequency, increasing the cluster’s thread pool queue size, or scaling the cluster.
