OpenSearch

OpenSearch is an open-source search and analytics engine. Use Zeotap to bulk-index documents into OpenSearch indices, keeping your search data in sync with your warehouse.

Prerequisites

An OpenSearch cluster (version 1.x or later) accessible over HTTPS
An API key or username/password credentials with write permissions on the target index
The cluster endpoint URL (e.g., https://my-cluster.us-east-1.es.amazonaws.com)

Authentication

This destination supports two authentication methods. Choose the one that matches your cluster configuration.

API Key

Field	Type	Required	Description
API Key	Password	Yes	Base64-encoded API key (the `encoded` value from the Create API Key response)

To create an API key:

1. Open OpenSearch Dashboards and navigate to Security > Auth Tokens
2. Click Create API key (requires the Security plugin to be enabled)
3. Assign a name and set the appropriate permissions (at minimum, write access on the target index)
4. Copy the encoded value from the response. This is your API key

Basic Auth

Field	Type	Required	Description
Username	Text	Yes	OpenSearch username
Password	Password	Yes	OpenSearch password

Use the credentials configured in your OpenSearch Security plugin. For Amazon OpenSearch Service, use the master user credentials or fine-grained access control credentials.
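
For reference, HTTP Basic authentication sends the username and password as a Base64-encoded `username:password` pair in the `Authorization` header (RFC 7617). The sketch below shows how such a header is typically built. The function name and sample credentials are hypothetical, and it is not a description of Zeotap's internal code.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header as defined in RFC 7617."""
    pair = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(pair).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical credentials, for illustration only.
headers = basic_auth_header("sync_user", "example-password")
```

Because Base64 is an encoding and not encryption, credentials sent this way are only protected when the endpoint uses HTTPS.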

Configuration

Field	Type	Required	Description
Endpoint URL	Text	Yes	The OpenSearch cluster endpoint URL. Must start with `http://` or `https://`
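
The scheme requirement above can be checked before saving the configuration. The following is a minimal sketch using Python's standard library; `validate_endpoint` is a hypothetical helper, and stripping a trailing slash is an assumption made here for convenience, not documented Zeotap behavior.

```python
from urllib.parse import urlparse


def validate_endpoint(url: str) -> str:
    """Return a normalized endpoint URL, or raise ValueError if it is unusable."""
    cleaned = url.strip()
    parsed = urlparse(cleaned)
    # urlparse lowercases the scheme, so "HTTPS://..." is accepted too.
    if parsed.scheme not in ("http", "https"):
        raise ValueError("Endpoint URL must start with http:// or https://")
    if not parsed.netloc:
        raise ValueError("Endpoint URL is missing a host name")
    # Assumption: drop a trailing slash so paths can be appended consistently.
    return cleaned.rstrip("/")
```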

Target Settings

Field	Type	Required	Description
Index Name	Text	Yes	The OpenSearch index to write documents to. If the index does not exist, OpenSearch auto-creates it on the first write

Supported Operations

Sync Modes

Mode	Supported	Description
Upsert	Yes	Creates new documents or fully replaces existing ones (uses the OpenSearch `index` action)
Insert	Yes	Creates new documents only; fails if a document with the same ID already exists (uses the `create` action)
Update	Yes	Partially updates existing documents (uses the `update` action with `doc` merge)
Mirror	—	Not supported

Audience Sync Modes

OpenSearch does not support audience sync modes. It has no list or segment membership API.

Features

Field Mapping: Yes — map source columns to OpenSearch document fields
Schema Introspection: No — OpenSearch indices accept dynamic mappings

Required Mapping Fields

There are no strictly required mapping fields. However, mapping a field to _id is strongly recommended for upsert and update modes so that Zeotap can address specific documents.

Default Destination Fields

Field	Type	Description
`_id`	string	OpenSearch document ID. If mapped, used as the document `_id` for upserts and updates

How It Works

Zeotap writes data to OpenSearch using the Bulk API:

Rows from the sync batch are converted to NDJSON (newline-delimited JSON) format
Each row becomes a two-line pair: an action/metadata line and a document body line
The action type depends on the sync mode:
- Upsert: {"index": {"_index": "my-index", "_id": "doc-123"}} followed by the full document
- Insert: {"create": {"_index": "my-index", "_id": "doc-123"}} followed by the full document
- Update: {"update": {"_index": "my-index", "_id": "doc-123"}} followed by {"doc": {...}}
Rows are sent in chunks of 500 documents per _bulk request
The Content-Type header is set to application/x-ndjson
Each chunk is sent with automatic retry on transient errors (429 Too Many Requests, 5xx)
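
The steps above can be sketched in a few lines of Python. This is an illustration of the NDJSON layout, not Zeotap's actual implementation: the function names are hypothetical, and removing `_id` from the document body is an assumption made here because OpenSearch treats `_id` as a metadata field.

```python
import json

# Sync mode -> Bulk API action, as listed above.
ACTIONS = {"upsert": "index", "insert": "create", "update": "update"}


def build_bulk_body(rows, index, mode, id_field="_id"):
    """Convert rows into an NDJSON payload for a single _bulk request."""
    action = ACTIONS[mode]
    lines = []
    for row in rows:
        meta = {"_index": index}
        if row.get(id_field) is not None:
            meta["_id"] = str(row[id_field])
        # Assumption: the ID travels in the action line, not in the body.
        doc = {k: v for k, v in row.items() if k != id_field}
        lines.append(json.dumps({action: meta}))
        lines.append(json.dumps({"doc": doc} if action == "update" else doc))
    # The Bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


def chunked(rows, size=500):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

Each chunk produced by `chunked` would be passed to `build_bulk_body` and sent as the body of one `_bulk` request with the `application/x-ndjson` content type.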

Response Handling

The Bulk API returns per-item status in its response. Zeotap inspects each item:

2xx status: Document indexed successfully
4xx status: Permanent failure (e.g., mapping conflict, document already exists in insert mode). The row is marked as failed with the OpenSearch error type and reason.
5xx status: Transient failure. The entire chunk is retried with exponential backoff.
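
As a rough illustration of these rules, the sketch below walks the `items` array of a Bulk API response and sorts each entry by status. The function name is hypothetical. Treating a per-item 429 as transient is an assumption, based on 429 being listed among the transient errors elsewhere on this page.

```python
def classify_bulk_items(response: dict):
    """Split Bulk API response items into successes, failures, and a retry flag."""
    succeeded, failed = [], []
    retry_chunk = False
    for position, item in enumerate(response.get("items", [])):
        # Each item is a one-key dict keyed by the action name
        # ("index", "create", or "update").
        _action, result = next(iter(item.items()))
        status = result.get("status", 0)
        if 200 <= status < 300:
            succeeded.append(position)
        elif status >= 500 or status == 429:
            # Assumption: 429 is handled like a 5xx and triggers a retry.
            retry_chunk = True
        else:
            error = result.get("error") or {}
            failed.append((position, error.get("type"), error.get("reason")))
    return succeeded, failed, retry_chunk
```

The position in `items` matches the position of the row in the request, which is how a failure can be traced back to its source row.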

Rate Limits

OpenSearch does not impose fixed rate limits at the API level. Instead, each shard has a configurable number of bulk request slots. When all slots are full, the cluster returns HTTP 429 (Too Many Requests).

Zeotap handles 429 responses with exponential backoff and automatic retry (up to 3 retries per chunk).
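
The retry pattern described here looks roughly like the sketch below. The one-second base delay and the doubling factor are assumptions chosen for illustration; this page only states that backoff is exponential and capped at 3 retries. `send` is a hypothetical callable that posts one chunk and returns the HTTP status code.

```python
import time


def send_with_retry(send, chunk, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Send a chunk, retrying on 429 and 5xx with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        status = send(chunk)
        if status != 429 and status < 500:
            return status  # success or a permanent (non-retryable) error
        if attempt == max_retries:
            break
        # Delays of base_delay * 1, 2, 4, ... (assumed schedule).
        sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"chunk still failing after {max_retries} retries (HTTP {status})")
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.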

Recommended Batch Sizes

Small documents (< 1 KB): 500–2,000 documents per request
Medium documents (1–10 KB): 200–500 documents per request
Large documents (> 10 KB): 50–200 documents per request

Zeotap uses a default chunk size of 500 documents, which works well for typical use cases.
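
To see which of the bands above your data falls into, you can estimate the average serialized document size from a sample of rows. The helper below is a hypothetical sketch; it assumes 1 KB means 1,024 bytes and measures the UTF-8 length of each row's JSON form.

```python
import json


def recommended_batch_range(sample_rows):
    """Return the (min, max) documents-per-request band for a sample of rows."""
    if not sample_rows:
        raise ValueError("sample_rows must contain at least one row")
    total = sum(len(json.dumps(row).encode("utf-8")) for row in sample_rows)
    average = total / len(sample_rows)
    if average < 1024:           # small documents
        return (500, 2000)
    if average <= 10 * 1024:     # medium documents
        return (200, 500)
    return (50, 200)             # large documents
```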

Best Practices

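Map _id explicitly for upsert and update modes. Without a mapped document ID, OpenSearch auto-generates one for each new document, so a later sync has no stable ID to target: upserts create duplicates instead of replacing documents, and updates cannot address a document at all.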
Create the index with explicit mappings before the first sync. While OpenSearch auto-creates indices with dynamic mapping, explicit mappings give you control over field types and analyzers.
Use Basic Auth or API key authentication depending on your cluster setup. For Amazon OpenSearch Service, use fine-grained access control with a dedicated user for sync operations.
Monitor cluster health during large syncs. The Bulk API can put significant load on the cluster, especially with large documents or high throughput.
Use upsert mode for most use cases. It is the most forgiving — it creates documents that don’t exist and replaces those that do.
Avoid insert mode unless you specifically need uniqueness enforcement. Insert mode fails if the document already exists, which can cause high failure rates on re-syncs.
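
As an example of creating an index with explicit mappings before the first sync, the sketch below prepares a `PUT /<index>` request with a `mappings.properties` body. The endpoint, index name, and field names are hypothetical, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def create_index_request(endpoint, index, properties, headers=None):
    """Prepare (but do not send) a create-index request with explicit mappings."""
    body = json.dumps({"mappings": {"properties": properties}}).encode("utf-8")
    request = Request(f"{endpoint.rstrip('/')}/{index}", data=body, method="PUT")
    request.add_header("Content-Type", "application/json")
    for name, value in (headers or {}).items():
        request.add_header(name, value)  # e.g. an Authorization header
    return request


# Hypothetical fields; choose types that match your source columns.
request = create_index_request(
    "https://localhost:9200",
    "customers",
    {
        "email": {"type": "keyword"},
        "signup_date": {"type": "date"},
        "lifetime_value": {"type": "double"},
    },
)
```

Sending the request with `urllib.request.urlopen(request)` or any HTTP client creates the index, provided the credentials have permission to create indices.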

Troubleshooting

Authentication failed (401)

Verify your API key or username/password are correct. For API keys, ensure you are using the encoded value (Base64-encoded), not the raw id or api_key fields separately. For Amazon OpenSearch Service, verify that your fine-grained access control credentials are correct and the master user has not been changed.

Forbidden (403)

The authenticated user or API key lacks the required permissions. Ensure the credentials have write privileges on the target index. For Amazon OpenSearch Service, check the access policy and fine-grained access control role mappings.

Index not found (404)

If using insert or update mode, the index must exist before writing. Create the index manually or switch to upsert mode, which triggers auto-creation.

Mapper parsing exception

A field value does not match the index mapping, for example a string sent to a field mapped as integer. Check the index mapping and ensure your source data types are compatible. Consider using explicit field mappings in Zeotap to cast or rename fields.

Version conflict engine exception

This occurs in update mode when there is a concurrent write to the same document. Zeotap retries these automatically. If conflicts persist, check for other processes writing to the same index.

Circuit breaker exception

The cluster is running low on memory. Reduce the sync batch size, add more nodes to the cluster, or increase the JVM heap size. This is a cluster-capacity issue, not a Zeotap issue.

Connection timeout

The OpenSearch cluster is unreachable or slow to respond. Verify the endpoint URL, check network connectivity, and ensure the cluster is healthy. For Amazon OpenSearch Service, verify the domain has not been paused or deleted, and that VPC security groups allow inbound traffic on port 443.

Too many requests (429)

The cluster’s bulk queue is full. Zeotap retries with exponential backoff automatically. If 429 errors persist, consider reducing the sync frequency, increasing the cluster’s thread pool queue size, or scaling the cluster.