
GreptimeDB vs. Elasticsearch

Inverted indexes were built for text search, not observability storage. For traces, ES inflates storage up to 45x. For logs, GreptimeDB ingests 4.7x faster and stores 1/10 the data in benchmark tests.

Inverted indexes power search.
They also inflate your observability storage.

Elasticsearch is a distributed search and analytics engine built on Apache Lucene, widely used for log analysis, APM, and enterprise search. Its inverted index excels at full-text search, but for observability workloads, especially traces and high-volume logs, the storage overhead is structural: the JSON document model, inverted indexes on every field, and multiple index replicas all inflate the data. See the full log benchmark report, and the storage efficiency analysis for the trace storage comparison.

CHALLENGER

Elasticsearch

Search-first architecture where observability data inflates fast

  • Indexes all fields by default — high-cardinality trace/span IDs inflate storage up to 45x
  • Log ingestion at 39K TPS — GreptimeDB achieves 185K TPS (4.7x) in [benchmark](/blogs/2025-04-24-elasticsearch-greptimedb-comparison-performance)
  • JVM tuning, shard management, and ILM add operational burden
VS

GREPTIMEDB

GreptimeDB

Columnar storage on S3, designed for observability retention from day one

  • Columnar compression stores observability data at 1/10 the size (benchmark)
  • Flexible indexing — skipping index for high-cardinality trace/span IDs, bloom-filter fulltext for logs, inverted index where it fits
  • 4.7x faster log ingestion in benchmark tests, S3 write with only 1-2% throughput loss
  • Jaeger UI compatible — migrate traces in about a week
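
As a sketch of what per-field indexing can look like in GreptimeDB SQL (the table, column names, and index options below are illustrative; verify the exact syntax against the GreptimeDB index documentation for your version):

```sql
-- Hypothetical log table: each column gets the index type that fits it,
-- instead of inverted-indexing everything by default.
CREATE TABLE app_logs (
  ts       TIMESTAMP TIME INDEX,
  host     STRING,
  -- High-cardinality ID: lightweight skipping index, no inverted index
  trace_id STRING SKIPPING INDEX,
  -- Bloom-filter-backed fulltext index for log search
  message  STRING FULLTEXT INDEX WITH (backend = 'bloom'),
  PRIMARY KEY (host)
);
```

The point of the sketch: index cost is an opt-in, per-column decision rather than a global default, which is what keeps high-cardinality trace/span IDs from inflating storage.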
Architecture comparison

Why Elasticsearch clusters grow operationally complex, and why GreptimeDB keeps the path simpler.

Elastic Stack (5-9 components)

  • Beats / OTel Collector / Logstash: collect + transform
  • Ingest Nodes + Data Nodes: index + store
  • Master / Coordinator Nodes: cluster routing
  • ILM + Snapshot Repositories: retention + archive

GreptimeDB (1 database)

  • Frontend node (stateless, auto-scale): query + ingest gateway
  • Datanode (compute, stateless): native object storage

  • Unified SQL + PromQL workflow for observability
  • Native support for logs, metrics, and traces
  • Scale compute and storage independently
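
To illustrate the unified workflow, the same database can answer an ad-hoc log question in plain SQL (table and column names here are illustrative):

```sql
-- Errors per host over the last hour, straight SQL over the log table
SELECT host, count(*) AS errors
FROM app_logs
WHERE message LIKE '%error%'
  AND ts >= now() - INTERVAL '1 hour'
GROUP BY host
ORDER BY errors DESC;
```

The same instance also serves PromQL through its Prometheus-compatible HTTP API, so metric dashboards and log queries run against one system instead of two.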
Feature comparison
| Dimension | GreptimeDB | Elasticsearch |
| --- | --- | --- |
| Storage format | Apache Parquet (columnar, compressed) | Lucene segments with inverted indexes |
| Indexing strategy | Per-field: inverted, skipping, bloom-filter fulltext, vector | Indexes all fields by default; storage-heavy on high-cardinality data |
| Storage efficiency | ~1/10 of ES for log data; up to 45x reduction for traces | JSON + inverted index + replicas inflate storage |
| Query language | SQL + PromQL (dual interface) | Query DSL (JSON-based), Elasticsearch SQL (built-in) |
| Data types | Metrics + logs + traces in one database | Primarily documents (separate systems for metrics) |
| Ingestion throughput | 185K TPS (structured logs) | 39K TPS (structured logs, 4.7x slower) |
| Storage backend | Native object storage (S3, OSS, GCS) | Local disk; snapshot to S3 for cold data |
| Scaling model | Compute-storage disaggregation, stateless nodes | Master/data node cluster with shard management |
| Trace compatibility | Jaeger Query API compatible; migrate in ~1 week | Native Jaeger/ES backend |
| OpenTelemetry | Native OTLP (all signals) | Via Elastic APM / Observability stack |
| License | Apache 2.0 | ELv2 / SSPL / AGPLv3 (triple-licensed since 2024) |
| Operational complexity | Single system for observability | ELK/EEK stack: multiple components to manage |

Log performance data from benchmark tests. Trace storage comparison based on production migration case study. Results vary by workload.

Migration path: as fast as one week

Keep your existing pipelines. Move ingest and query incrementally.

Redirect ingest

Docs

Point compatible ingest endpoints (OTLP, HTTP, Bulk API) to GreptimeDB while existing agents remain unchanged.

30 min
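
For an OpenTelemetry Collector pipeline, the redirect is typically a one-exporter change. The fragment below is a sketch only: the endpoint host, port, and path are assumptions to verify against the GreptimeDB OTLP ingest docs for your deployment.

```yaml
exporters:
  otlphttp/greptimedb:
    # Assumed OTLP/HTTP endpoint; adjust host, port, and path per your setup
    endpoint: http://greptimedb:4000/v1/otlp

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp/greptimedb]
```

Existing agents and receivers stay as they are; only the export target changes.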

Switch dashboards

Docs

Update Grafana datasource to GreptimeDB for metrics/logs/traces views and keep dashboards running during transition.

1 hour
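
A sketch of a Grafana provisioning file that points a Prometheus-type datasource at GreptimeDB; the `/v1/prometheus` path is an assumption to check against the GreptimeDB and Grafana documentation for your versions.

```yaml
apiVersion: 1
datasources:
  - name: GreptimeDB
    type: prometheus
    access: proxy
    # Assumed Prometheus-compatible API path; verify for your deployment
    url: http://greptimedb:4000/v1/prometheus
```

Because the API is Prometheus-compatible, existing dashboards keep working while the datasource underneath changes.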

Backfill historical data

Export snapshots or historical indices and bulk import into GreptimeDB without interrupting live writes.

1-3 days
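
Backfill might look like the following, assuming historical data has been exported to Parquet files in an object-storage bucket; the bucket path is illustrative and the exact `COPY ... FROM` options (including any connection/credential clause) should be confirmed in the GreptimeDB docs.

```sql
-- Bulk-load exported historical data without touching the live write path
COPY app_logs FROM 's3://backfill-bucket/es-export/'
WITH (FORMAT = 'parquet');
```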

Decommission Elastic cluster

After verification, gradually scale down Elasticsearch data nodes and retire redundant ingest/index pipelines.

2 weeks
