← Back to Blogs
HN Story

Inside Aether: A Formally-Verified, High-Performance Storage Engine in Rust

May 20, 2026

Inside Aether: A Formally-Verified, High-Performance Storage Engine in Rust

The development of a database storage engine is often seen as one of the most challenging undertakings in systems programming. It requires a delicate balance between raw performance, strict ACID compliance, and the absolute guarantee that data will not be lost during a catastrophic crash. Enter Aether, a high-performance storage engine written in Rust that attempts to synthesize several cutting-edge academic and industry concepts into a single, cohesive framework.

By integrating formal verification via TLA+ with high-performance primitives like swizzled pointers and the Taurus WAL algorithm, Aether aims to provide a foundation for everything from Redis-compatible caches to specialized financial transaction databases. This post dives into the architecture, performance characteristics, and the technical philosophy behind Aether.

The Architectural Stack

Aether is designed as a layered system, ensuring that high-level API calls are translated efficiently into page-level disk I/O. The architecture follows a strict hierarchy:

  1. API Layer: Provides the KvStore and DbEnv interfaces for application interaction.
  2. Index Layer: Supports multiple indexing strategies, including B+ Trees, Skiplists, and RAX radix trees.
  3. LSM Framework: An optional layer providing pluggable compaction strategies, most notably a pure HanoiDB implementation.
  4. Transactions & Recovery: Implements the ARIES recovery protocol to ensure atomicity and durability.
  5. WAL (Write-Ahead Logging): Utilizes the Taurus algorithm for lock-free, per-thread logging.
  6. Buffer Manager: A LeanStore-inspired manager using swizzled pointers to minimize overhead.
  7. Page Layer: Manages the physical layout of data in 4KB pages.

Key Technical Innovations

LeanStore-Inspired Buffer Management

Aether implements a buffer manager based on the LeanStore paper, which focuses on reducing the overhead of mapping virtual page IDs to physical memory addresses. By using swizzled pointers, Aether can transition pages between HOT, COOL, and EVICTED states. This allows the "hot path" for page access to be as fast as 24ns (approximately 41 million operations per second), bypassing expensive hash table lookups for frequently accessed data.

Formal Verification with TLA+

AUnlike many storage engines that rely solely on extensive testing, Aether employs TLA+ (Temporal Logic of Actions) to formally verify its core logic. The specifications specifically target:

  • LSN (Log Sequence Number) Uniqueness: Ensuring monotonicity and uniqueness to prevent log corruption.
  • Crash Recovery: Verifying that no data is lost during a crash and that the ARIES redo/undo phases are logically sound.
  • Concurrent Safety: Proving that the system remains consistent under highly concurrent access patterns.

The HanoiDB LSM Implementation

For write-heavy workloads, Aether provides an LSM tree framework. Its implementation of HanoiDB is particularly noteworthy for its claim of constant 2.0x write amplification. By distributing compaction work across writes (incremental merging), Aether achieves remarkably stable latency, with p99s in the 1-2µs range, effectively eliminating the massive latency spikes typically associated with LSM compaction.

Performance Benchmarks

Aether's design choices translate into significant performance gains across different components:

  • WAL Throughput: The use of lock-free per-thread streams and group commits allows the WAL to scale linearly with thread count, making it optimal for high-concurrency environments.
  • LSM Latency: The project reports a p999 latency of 4-22µs, which is orders of magnitude better than traditional LSM implementations that suffer from "stop-the-world" compaction events.
  • Recovery: The ARIES-based recovery scales linearly with the size of the log, ensuring predictable boot times after a failure.

Community Reception and Critical Perspectives

Despite its technical ambition, Aether has sparked a debate within the developer community regarding the nature of modern software development and the definition of "verification."

Some critics argue that formal verification of a model is insufficient if the implementation does not strictly refine that model. As user @Kab1r noted:

"I don't think formally verifying my showing that the model is correct is good enough anymore. You must prove that your implementation refines the model."

Others have questioned the provenance of the code, suggesting that the sheer complexity and speed of development indicate heavy reliance on LLMs. While some view this as a "100x multiplier" for domain experts, others are skeptical of "vibe-coded" software when it comes to mission-critical data storage, preferring battle-tested alternatives like SQLite or PostgreSQL.

Conclusion

Aether represents a bold attempt to bring academic rigor (TLA+) and high-performance research (LeanStore, HanoiDB) into a practical Rust implementation. Whether it is intended as a learning project or a production-ready engine, it serves as a sophisticated case study in how to build a modern, ACID-compliant storage system from the ground up.

References

HN Stories