
Elasticsearch

Elasticsearch is a distributed search and analytics engine. Use Zeotap to bulk-index documents into Elasticsearch indices and keep your search data in sync with your warehouse.

Prerequisites

  • An Elasticsearch cluster (version 7.x or later) accessible over HTTPS
  • An API key or username/password credentials with write permissions on the target index
  • The cluster endpoint URL (e.g., https://my-cluster.es.us-east-1.aws.found.io:9243)

Authentication

Elasticsearch supports two authentication methods. Choose the one that matches your cluster configuration.

API Key

| Field | Type | Required | Description |
|---|---|---|---|
| API Key | Password | Yes | Base64-encoded API key (the encoded value from the Create API Key response) |

To create an API key:

  1. Open Kibana and navigate to Stack Management > API Keys
  2. Click Create API key
  3. Assign a name and set the appropriate role permissions (at minimum, write and create_index on the target index)
  4. Copy the encoded value from the response — this is your API key
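
For reference, the encoded value is the Base64 encoding of the key's id and api_key joined by a colon, and Elasticsearch expects it in an `Authorization: ApiKey ...` header. The sketch below shows that relationship; the function names and the sample id and api_key values are hypothetical, and how Zeotap builds the header internally is an assumption.

```python
import base64


def encode_api_key(key_id: str, api_key: str) -> str:
    """Return the Base64 'encoded' form of an Elasticsearch API key (id:api_key)."""
    raw = f"{key_id}:{api_key}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def api_key_header(encoded: str) -> dict:
    """Build the Authorization header that Elasticsearch expects for API keys."""
    return {"Authorization": f"ApiKey {encoded}"}


# Hypothetical id and api_key values, for illustration only.
encoded = encode_api_key("example-key-id", "example-api-key-secret")
headers = api_key_header(encoded)
```

If Kibana already shows you the encoded value, paste that value into the API Key field as-is; the helper only illustrates where it comes from.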

Basic Auth

| Field | Type | Required | Description |
|---|---|---|---|
| Username | Text | Yes | Elasticsearch username |
| Password | Password | Yes | Elasticsearch password |

Configuration

| Field | Type | Required | Description |
|---|---|---|---|
| Endpoint URL | Text | Yes | The Elasticsearch cluster endpoint URL. Must start with http:// or https:// |
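
If you want to check an endpoint value before entering it, a minimal sketch of the scheme rule might look like this. The function name `validate_endpoint` is hypothetical, and stripping a trailing slash is an assumption made for tidiness, not documented Zeotap behavior.

```python
from urllib.parse import urlsplit


def validate_endpoint(url: str) -> str:
    """Check that the URL uses http or https and has a host; return it without a trailing slash."""
    cleaned = url.strip()
    parts = urlsplit(cleaned)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Endpoint URL must start with http:// or https://: {url!r}")
    return cleaned.rstrip("/")


endpoint = validate_endpoint("https://my-cluster.es.us-east-1.aws.found.io:9243/")
```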

Target Settings

| Field | Type | Required | Description |
|---|---|---|---|
| Index Name | Text | Yes | The Elasticsearch index to write documents to. If the index does not exist, Elasticsearch auto-creates it on the first write |

Supported Operations

Sync Modes

| Mode | Supported | Description |
|---|---|---|
| Upsert | Yes | Creates new documents or fully replaces existing ones (uses the Elasticsearch index action) |
| Insert | Yes | Creates new documents only; fails if a document with the same ID already exists (uses the create action) |
| Update | Yes | Partially updates existing documents (uses the update action with doc merge) |
| Mirror | No | Not supported |

Audience Sync Modes

Elasticsearch does not support audience sync modes. It has no list or segment membership API.

Features

  • Field Mapping: Yes — map source columns to Elasticsearch document fields
  • Schema Introspection: No — Elasticsearch indices accept dynamic mappings

Required Mapping Fields

There are no strictly required mapping fields. However, mapping a field to _id is strongly recommended for upsert and update modes so that Zeotap can address specific documents.

Default Destination Fields

| Field | Type | Description |
|---|---|---|
| _id | string | Elasticsearch document ID. If mapped, used as the document _id for upserts and updates |

How It Works

Zeotap writes data to Elasticsearch using the Bulk API:

  1. Rows from the sync batch are converted to NDJSON (newline-delimited JSON) format
  2. Each row becomes a two-line pair: an action/metadata line and a document body line
  3. The action type depends on the sync mode:
    • Upsert: {"index": {"_index": "my-index", "_id": "doc-123"}} followed by the full document
    • Insert: {"create": {"_index": "my-index", "_id": "doc-123"}} followed by the full document
    • Update: {"update": {"_index": "my-index", "_id": "doc-123"}} followed by {"doc": {...}}
  4. Rows are sent in chunks of 500 documents per _bulk request
  5. The Content-Type header is set to application/x-ndjson
  6. Each chunk is sent with automatic retry on transient errors (429 Too Many Requests, 5xx)
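
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Zeotap's implementation: the names `ACTION_BY_MODE`, `bulk_payload`, and `chunked` are hypothetical, and rows are assumed to be plain dictionaries with an optional `_id` key.

```python
import json
from typing import Any, Dict, Iterator, List

# Sync mode -> Bulk API action, as listed in step 3.
ACTION_BY_MODE = {"upsert": "index", "insert": "create", "update": "update"}


def bulk_payload(rows: List[Dict[str, Any]], index: str, mode: str) -> str:
    """Render rows as an NDJSON _bulk body: one action line plus one body line per row."""
    action = ACTION_BY_MODE[mode]
    lines = []
    for row in rows:
        doc = dict(row)  # copy so the caller's row is not modified
        meta = {"_index": index}
        doc_id = doc.pop("_id", None)  # _id is metadata, so it is kept out of the body
        if doc_id is not None:
            meta["_id"] = str(doc_id)
        lines.append(json.dumps({action: meta}))
        # Update sends a partial document wrapped in "doc"; other modes send the full document.
        lines.append(json.dumps({"doc": doc} if action == "update" else doc))
    return "\n".join(lines) + "\n"  # the Bulk API requires a trailing newline


def chunked(rows: List[Dict[str, Any]], size: int = 500) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


payload = bulk_payload([{"_id": "doc-123", "name": "Ada"}], "my-index", "upsert")
```

Each chunk's payload would then be sent as the body of a POST to the cluster's `_bulk` endpoint with the `application/x-ndjson` content type.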

Response Handling

The Bulk API returns per-item status in its response. Zeotap inspects each item:

  • 2xx status: Document indexed successfully
  • 4xx status: Permanent failure (e.g., mapping conflict, document already exists in insert mode). The row is marked as failed with the Elasticsearch error type and reason.
  • 5xx status: Transient failure. The entire chunk is retried with exponential backoff.
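
A minimal sketch of this per-item inspection, assuming the standard Bulk API response shape (an `items` list in which each entry has a single key naming the action). The function name `classify_bulk_response` and the sample response are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def classify_bulk_response(response: Dict[str, Any]) -> Tuple[List[Any], List[Dict[str, Any]], bool]:
    """Split bulk items into succeeded IDs, permanent failures, and a retry-the-chunk flag."""
    succeeded: List[Any] = []
    failed: List[Dict[str, Any]] = []
    retry_chunk = False
    for item in response.get("items", []):
        # Each item looks like {"index": {...}}, {"create": {...}} or {"update": {...}}.
        _action, result = next(iter(item.items()))
        status = result.get("status", 0)
        if 200 <= status < 300:
            succeeded.append(result.get("_id"))
        elif 400 <= status < 500:
            error = result.get("error", {})
            failed.append({
                "_id": result.get("_id"),
                "type": error.get("type"),
                "reason": error.get("reason"),
            })
        else:
            retry_chunk = True  # 5xx (or unknown) status: treat as transient
    return succeeded, failed, retry_chunk


# Hypothetical response with one success, one permanent failure, one transient failure.
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"create": {"_id": "2", "status": 409, "error": {
            "type": "version_conflict_engine_exception",
            "reason": "document already exists"}}},
        {"index": {"_id": "3", "status": 503, "error": {
            "type": "unavailable_shards_exception",
            "reason": "primary shard is not active"}}},
    ],
}
succeeded, failed, retry_chunk = classify_bulk_response(sample)
```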

Rate Limits

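Elasticsearch does not impose fixed rate limits at the API level. Instead, each node processes bulk requests through its write thread pool, which has a bounded queue (thread_pool.write.queue_size; the default depends on the Elasticsearch version). When the queue is full, the cluster rejects further bulk requests with HTTP 429 (Too Many Requests).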

Zeotap handles 429 responses with exponential backoff and automatic retry (up to 3 retries per chunk).
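
The retry behavior can be illustrated with a short sketch. The one-second base delay and the doubling factor are assumptions, since the exact backoff schedule is not documented here; `send_with_retry` and its `send` callback (which returns an HTTP status code) are hypothetical names.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[str], int],
    payload: str,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(payload); on 429 or 5xx, wait and retry up to max_retries times."""
    attempt = 0
    while True:
        status = send(payload)
        transient = status == 429 or 500 <= status < 600
        if not transient or attempt == max_retries:
            return status  # success, permanent failure, or retries exhausted
        sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the assumed defaults
        attempt += 1


# Simulated sender: two transient failures, then success. No network access needed.
responses = iter([429, 503, 200])
delays = []
final_status = send_with_retry(lambda _payload: next(responses), "{}\n", sleep=delays.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in practice the default `time.sleep` would be used.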

As a rough guide, the number of documents per bulk request should shrink as document size grows:

  • Small documents (< 1 KB): 500–2,000 documents per request
  • Medium documents (1–10 KB): 200–500 documents per request
  • Large documents (> 10 KB): 50–200 documents per request

Zeotap uses a default chunk size of 500 documents, which works well for typical use cases.
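
If you are sizing bulk requests for your own tooling, the size bands above can be turned into a simple heuristic. This sketch picks the conservative lower bound of each band; the function name `suggest_chunk_size` is hypothetical, and measuring size as the UTF-8 length of the JSON serialization is an assumption.

```python
import json
from typing import Any, Dict, List


def suggest_chunk_size(sample_docs: List[Dict[str, Any]]) -> int:
    """Suggest documents per bulk request from the average serialized document size."""
    if not sample_docs:
        return 500  # fall back to the default chunk size
    total_bytes = sum(len(json.dumps(doc).encode("utf-8")) for doc in sample_docs)
    average = total_bytes / len(sample_docs)
    if average < 1024:          # small documents (< 1 KB)
        return 500
    if average <= 10 * 1024:    # medium documents (1-10 KB)
        return 200
    return 50                   # large documents (> 10 KB)


small = suggest_chunk_size([{"name": "Ada"}])
medium = suggest_chunk_size([{"bio": "x" * 5000}])
large = suggest_chunk_size([{"bio": "x" * 20000}])
```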

Best Practices

  • Map _id explicitly for upsert and update modes. Without a mapped document ID, Elasticsearch auto-generates one, so later syncs cannot address the same document: upserts create duplicates and updates have nothing to target.
  • Create the index with explicit mappings before the first sync. While Elasticsearch auto-creates indices with dynamic mapping, explicit mappings give you control over field types and analyzers.
  • Use API key authentication in production. API keys can be scoped to specific indices and operations, following the principle of least privilege.
  • Monitor cluster health during large syncs. The Bulk API can put significant load on the cluster, especially with large documents or high throughput.
  • Use upsert mode for most use cases. It is the most forgiving — it creates documents that don’t exist and replaces those that do.
  • Avoid insert mode unless you specifically need uniqueness enforcement. Insert mode fails if the document already exists, which can cause high failure rates on re-syncs.
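
To illustrate the explicit-mappings recommendation, the sketch below builds (but does not send) a create-index request using only the standard library. The index name `customers`, the field names, the local endpoint, and the helper `build_create_index_request` are all hypothetical; the body follows the typeless `mappings.properties` format used by Elasticsearch 7.x and later.

```python
import json
import urllib.request
from typing import Dict


def build_create_index_request(
    endpoint: str, index: str, properties: Dict[str, dict], auth_header: str
) -> urllib.request.Request:
    """Build a PUT /<index> request that creates an index with explicit field mappings."""
    body = json.dumps({"mappings": {"properties": properties}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{endpoint.rstrip('/')}/{index}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "Authorization": auth_header},
    )


# Hypothetical fields; pick types that match your source columns.
request = build_create_index_request(
    "https://localhost:9200",
    "customers",
    {
        "email": {"type": "keyword"},
        "age": {"type": "integer"},
        "signup_date": {"type": "date"},
        "notes": {"type": "text"},
    },
    "ApiKey <encoded-value>",
)
# To actually create the index: urllib.request.urlopen(request)
```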

Troubleshooting

Authentication failed (401)

Verify your API key or username/password are correct. For API keys, ensure you are using the encoded value (Base64-encoded), not the raw id or api_key fields separately. Check that the API key has not been invalidated or expired.

Forbidden (403)

The authenticated user or API key lacks the required permissions. Ensure the credentials have write and create_index privileges on the target index. For API keys, check the role descriptors used when creating the key.
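
As an illustration of a least-privilege key, the following sketch builds the request body for the Elasticsearch create API key endpoint (POST /_security/api_key) with only the two privileges named above. The key name, role name, and index name are hypothetical.

```python
import json
from typing import Any, Dict


def api_key_request_body(name: str, index: str) -> Dict[str, Any]:
    """Body for POST /_security/api_key granting write and create_index on one index."""
    return {
        "name": name,
        "role_descriptors": {
            f"{name}-writer": {  # hypothetical role name
                "indices": [
                    {"names": [index], "privileges": ["write", "create_index"]}
                ]
            }
        },
    }


body = api_key_request_body("zeotap-sync", "my-index")
print(json.dumps(body, indent=2))
```

If a 403 persists, compare the role descriptors on the existing key against a body like this one, paying attention to the index names it covers.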

Index not found (404)

If using insert or update mode, the index must exist before writing. Create the index manually or switch to upsert mode, which triggers auto-creation.

Mapper parsing exception

A field value does not match the index mapping. For example, sending a string to a field mapped as integer. Check the index mapping and ensure your source data types are compatible. Consider using explicit field mappings in Zeotap to cast or rename fields.
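
One way to catch such mismatches before a sync is to check a sample of rows against the types you expect. The sketch below is a local pre-check only, not a Zeotap feature; `CASTS`, `coerce_row`, and the sample field names are hypothetical, and only a few mapping types are covered.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical subset of Elasticsearch field types and the Python casts used to check them.
CASTS = {"integer": int, "float": float, "keyword": str}


def coerce_row(row: Dict[str, Any], expected: Dict[str, str]) -> Tuple[Dict[str, Any], List[str]]:
    """Cast values to the expected mapping types; return the coerced row and failing field names."""
    coerced: Dict[str, Any] = {}
    errors: List[str] = []
    for field, value in row.items():
        cast = CASTS.get(expected.get(field, ""))
        if cast is None or value is None:
            coerced[field] = value  # no known type for this field, or a null value
            continue
        try:
            coerced[field] = cast(value)
        except (TypeError, ValueError):
            errors.append(field)  # this value would likely trigger a mapper parsing exception
    return coerced, errors


clean, bad_fields = coerce_row(
    {"age": "42", "name": "Ada", "score": "n/a"},
    {"age": "integer", "score": "float"},
)
```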

Version conflict engine exception

This occurs in update mode when there is a concurrent write to the same document. Zeotap retries these automatically. If conflicts persist, check for other processes writing to the same index.

Circuit breaker exception

The cluster is running low on memory. Reduce the sync batch size, add more nodes to the cluster, or increase the JVM heap size. This is a cluster-capacity issue, not a Zeotap issue.

Connection timeout

The Elasticsearch cluster is unreachable or slow to respond. Verify the endpoint URL, check network connectivity, and ensure the cluster is healthy. For cloud-hosted clusters (Elastic Cloud, AWS OpenSearch), verify the cluster has not been paused or terminated.

Too many requests (429)

The cluster’s bulk queue is full. Zeotap retries with exponential backoff automatically. If 429 errors persist, consider reducing the sync frequency, increasing the cluster’s thread pool queue size, or scaling the cluster.