Why Thermocline?
The document database built for AI workloads, cost efficiency, and operational simplicity
Connect to Thermocline with any MongoDB driver. Same wire protocol, same MQL, same aggregation pipelines. Just change your connection string.
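As a minimal sketch of what "just change your connection string" could look like in practice, the helper below swaps the host in an existing MongoDB URI and leaves credentials, database, and options untouched. The host name `thermocline.example.internal` and the helper `retarget_uri` are hypothetical and used for illustration only.

```python
from urllib.parse import urlsplit


def retarget_uri(uri: str, new_host: str, new_port: int = 27017) -> str:
    """Return `uri` with its host and port replaced, keeping everything else."""
    parts = urlsplit(uri)
    # Keep the "user:password@" prefix (if any) and drop the old host:port.
    userinfo, sep, _old_hostport = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{sep}{new_host}:{new_port}"
    return parts._replace(netloc=netloc).geturl()


legacy_uri = "mongodb://app:secret@mongo.internal:27017/shop?retryWrites=true"
new_uri = retarget_uri(legacy_uri, "thermocline.example.internal")  # hypothetical host
print(new_uri)

# With PyMongo installed, the application code would stay the same:
#   from pymongo import MongoClient
#   client = MongoClient(new_uri)
```

Only the URI changes; existing queries and aggregation pipelines would be sent through the same driver calls as before.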
Built-in HNSW and DiskANN indexes for semantic similarity. $vectorSearch aggregation stage. Store documents and embeddings together.
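The sketch below shows how a `$vectorSearch` pipeline might be assembled. The field names (`index`, `path`, `queryVector`, `numCandidates`, `limit`) and the `vectorSearchScore` metadata key follow MongoDB Atlas's `$vectorSearch` stage; whether Thermocline accepts exactly these fields is an assumption. The index name, field names, and helper are hypothetical.

```python
def build_vector_search_pipeline(query_vector, *, index="embedding_index",
                                 path="embedding", limit=5, num_candidates=100):
    """Build an aggregation pipeline for approximate nearest-neighbor search."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index,                      # hypothetical index name
                "path": path,                        # field holding the embedding
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Return the document title along with its similarity score.
        {"$project": {"_id": 0, "title": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]


pipeline = build_vector_search_pipeline([0.12, -0.03, 0.88], limit=3)
print(pipeline)

# With a driver, the pipeline would be passed to aggregate(), for example:
#   results = client["shop"]["products"].aggregate(pipeline)
```

Because documents and embeddings live in the same collection, the `$project` stage can return ordinary document fields next to the score without a second lookup.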
Automatically tier historical data to cost-optimized object storage while maintaining full query access. Pay hot prices only for hot data.
Runs on any Kubernetes cluster with any S3-compatible object storage. AWS, GCP, Azure, on-premises — your infrastructure, your choice.
Features
Everything you need from an enterprise-grade, AI-native document database
Query Federation
Transparently split, route, and merge queries across the hot Storage Engine and cold object storage in a single operation
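To illustrate the split, route, and merge idea, here is a conceptual sketch, not Thermocline's implementation. It assumes a time-range query, a single cutoff separating cold from hot data, and that each tier returns results already sorted by a `ts` field. All names are hypothetical.

```python
import heapq


def split_range(start, end, cutoff):
    """Split [start, end) at `cutoff` into (cold_range, hot_range)."""
    cold = (start, min(end, cutoff)) if start < cutoff else None
    hot = (max(start, cutoff), end) if end > cutoff else None
    return cold, hot


def federated_find(start, end, cutoff, query_hot, query_cold):
    """Route each sub-range to its tier and merge the sorted results."""
    cold_range, hot_range = split_range(start, end, cutoff)
    streams = []
    if cold_range:
        streams.append(query_cold(*cold_range))
    if hot_range:
        streams.append(query_hot(*hot_range))
    return list(heapq.merge(*streams, key=lambda doc: doc["ts"]))


def make_source(docs):
    """In-memory stand-in for a storage tier; `docs` must be sorted by ts."""
    def query(lo, hi):
        return [d for d in docs if lo <= d["ts"] < hi]
    return query


cold = make_source([{"ts": 1}, {"ts": 4}])
hot = make_source([{"ts": 7}, {"ts": 9}])
result = federated_find(2, 10, cutoff=5, query_hot=hot, query_cold=cold)
print([d["ts"] for d in result])
```

The caller issues one query and receives one ordered result; the tier boundary only shows up inside `federated_find`.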
Policy-Driven Lifecycle
Define archival rules by age, size, or custom predicates. Data moves automatically from hot to cold with zero downtime
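A rule that combines age, size, and a custom predicate could be modeled roughly as follows. This is an illustrative sketch only: the `ArchivalRule` class, its field names, and the choice that any single matching condition triggers archival are assumptions, not Thermocline's policy format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass
class ArchivalRule:
    max_age: Optional[timedelta] = None
    max_size_bytes: Optional[int] = None
    predicate: Optional[Callable[[dict], bool]] = None

    def should_archive(self, doc: dict, size_bytes: int, now: datetime) -> bool:
        """Return True if any configured condition matches the document."""
        if self.max_age is not None and now - doc["created_at"] > self.max_age:
            return True
        if self.max_size_bytes is not None and size_bytes > self.max_size_bytes:
            return True
        if self.predicate is not None and self.predicate(doc):
            return True
        return False


rule = ArchivalRule(
    max_age=timedelta(days=90),
    predicate=lambda d: d.get("status") == "closed",
)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old_doc = {"created_at": datetime(2024, 1, 1, tzinfo=timezone.utc), "status": "open"}
new_doc = {"created_at": datetime(2024, 5, 20, tzinfo=timezone.utc), "status": "open"}

print(rule.should_archive(old_doc, size_bytes=512, now=now))
print(rule.should_archive(new_doc, size_bytes=512, now=now))
```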
Complete MQL Translation
Every MongoDB query operator, aggregation stage, and cursor operation works seamlessly on cold data via Parquet translation
Multi-Cloud Storage
Native adapters for AWS S3, Google Cloud Storage, Azure Blob Storage, and any S3-compatible endpoint like MinIO
Wire Protocol Compatible
Full MongoDB 4.4+ wire protocol implementation. Connect with any MongoDB driver: no custom SDK, no extra library, no application code changes
Security & RBAC
Native SCRAM-SHA-256 auth, TLS 1.3, granular RBAC for vector search and time travel, encryption at rest, and audit logging
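On the client side, SCRAM-SHA-256 and TLS are typically requested through standard MongoDB connection string options. The sketch below builds such a URI, assuming Thermocline honors the standard `authMechanism` and `tls` options; the host, credentials, and helper name are hypothetical. The TLS version itself is negotiated by the server and client libraries and is not set in the URI.

```python
from urllib.parse import quote_plus, urlencode


def build_secure_uri(user: str, password: str, host: str,
                     port: int = 27017, database: str = "admin") -> str:
    """Build a MongoDB URI requesting SCRAM-SHA-256 auth over TLS."""
    options = urlencode({"authMechanism": "SCRAM-SHA-256", "tls": "true"})
    # Credentials must be percent-encoded so characters like "@" and "/" are safe.
    return (f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/{database}?{options}")


secure_uri = build_secure_uri("app", "p@ss/word", "thermocline.example.internal")
print(secure_uri)
```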
Full Observability
Prometheus metrics, structured JSON logging, OpenTelemetry distributed tracing, and pre-built Grafana dashboards
High Availability
Raft consensus replication with sub-5s failover. Multi-replica, multi-AZ deployments with no single point of failure
Kubernetes Native
Production-ready Helm charts, Horizontal Pod Autoscaling, resource quotas, and GitOps-compatible deployment patterns
Backup & Restore
WAL-based point-in-time recovery for hot and cold data. Automated scheduled backups with cross-region DR support
Intelligent Caching
Seven cache tiers including block cache, vector index cache, MVCC version cache, and WAL read cache with adaptive eviction
Parquet Storage Format
Columnar Parquet format with configurable compression (zstd, snappy, gzip) for optimal query performance and storage efficiency
Architecture
A standalone AI-native database with automatic hot/cold tiering
Community
Thermocline is SSPL-licensed and community-driven