Building a RAR Implementation in Rust Using LLMs

Writing a full RAR compressor is a monumental task. Historically, implementing one would have taken years of manual reverse-engineering and coding, largely because RAR is a "middle-aged format that never stopped growing up," with complex features such as multi-volume support, recovery records, and an internal VM.

In a recent project, developer davidsong compressed what would theoretically have been a five-year effort into five weeks of evenings and weekends. The secret weapon? A combination of OpenAI Codex 5.5 and Claude Opus 4.7. The result is rars, a Rust-based implementation of the RAR format that, while admittedly "sloppy" and "slow," offers a free software alternative for a proprietary format.

The Reverse-Engineering Process

Because a formal specification for RAR does not exist, the author began by synthesizing a spec from fragmented sources. This involved pulling data from free decompressors like unar, libarchive, and UNRARLIB, as well as various web pages and community folklore.

Claude was tasked with documenting these sources. By querying iteratively and maintaining a "gaps doc" to track missing features, the author spent two weeks refining the reader side of the format. The writer side was more elusive and required a combination of hex-dumping, Ghidra, and DOSBox-X to analyze RAR binaries for DOS and Windows.

This rigorous process eventually yielded comprehensive spec docs for every version of the RAR file format, which are now available as a public resource.
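As a small illustration of what the reader side of such a spec has to pin down first, the sketch below locates the archive signature and tells the two on-disk generations apart. The magic bytes are the well-known RAR markers; the type and function names are hypothetical and are not taken from rars.

```rust
/// Archive generations distinguishable from the leading magic bytes.
/// (Hypothetical type for illustration; not the rars API.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RarVersion {
    /// RAR 1.5 through 4.x: "Rar!" 0x1A 0x07 0x00
    Rar4,
    /// RAR 5.0 and later: "Rar!" 0x1A 0x07 0x01 0x00
    Rar5,
}

const RAR4_MAGIC: &[u8] = b"Rar!\x1a\x07\x00";
const RAR5_MAGIC: &[u8] = b"Rar!\x1a\x07\x01\x00";

/// Scans `data` for a RAR signature and returns its byte offset and version.
/// The scan does not stop at offset 0 because self-extracting archives
/// place an executable stub in front of the signature.
pub fn find_signature(data: &[u8]) -> Option<(usize, RarVersion)> {
    for offset in 0..data.len() {
        let rest = &data[offset..];
        if rest.starts_with(RAR5_MAGIC) {
            return Some((offset, RarVersion::Rar5));
        }
        if rest.starts_with(RAR4_MAGIC) {
            return Some((offset, RarVersion::Rar4));
        }
    }
    None
}
```

A real reader would cap the scan at a maximum stub size and go on to parse the headers that follow the signature; this sketch stops at detection.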

LLM Workflow and Strategy

Implementing the code from the spec meant balancing different AI models against one another. The author noted distinct personalities and strengths for each:

Claude Opus: Excellent for strategy and architecture discussions, though prone to generating code without considering the big picture.
OpenAI Codex 5.5: Highly effective at following specs and staying on target, though it could "rabbit hole" if prompted too much.

Managing "Slop" and Hallucinations

One of the primary challenges was preventing the codebase from becoming an unmanageable mass of "slop." The author employed several strategies to maintain quality:

Massive Testing: The author implemented an excessive number of unit tests. While some were fragile, they provided a "statistical mass" that steered the LLMs back on track when they attempted to cut corners.
Cross-Cutting Reviews: Claude was used to generate full code reviews, which were then filtered into a plan.md file to drive development tasks. This prevented the AI from becoming overly focused on nitpicks.
Empirical Validation: By grinding against real-world RAR archives, the author was able to clear up "autofill bullshit" (hallucinations) that had previously passed multiple rounds of review.
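The "statistical mass" idea can be sketched as a bulk round-trip check: generate many small inputs deterministically, push each through an encoder and decoder, and report the first input that does not survive. This is an illustrative harness under assumed names, not the test suite used in rars, and it complements rather than replaces testing against real archives.

```rust
/// Tiny deterministic xorshift64 generator, so a failure can be
/// reproduced from its seed without external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Runs `cases` round trips through the supplied encoder/decoder pair
/// and returns the first input that fails to round-trip, if any.
/// (Hypothetical helper for illustration.)
pub fn first_round_trip_failure<E, D>(
    encode: E,
    decode: D,
    cases: usize,
    seed: u64,
) -> Option<Vec<u8>>
where
    E: Fn(&[u8]) -> Vec<u8>,
    D: Fn(&[u8]) -> Vec<u8>,
{
    // The xorshift state must be non-zero, so clamp the seed.
    let mut rng = XorShift64(seed.max(1));
    for _ in 0..cases {
        let len = (rng.next_u64() % 64) as usize;
        // A four-letter alphabet makes repeats likely, which is what
        // exercises a compressor's match-finding paths.
        let input: Vec<u8> = (0..len)
            .map(|_| b'a' + (rng.next_u64() % 4) as u8)
            .collect();
        if decode(&encode(&input)) != input {
            return Some(input);
        }
    }
    None
}
```

Returning the failing input rather than a boolean gives a concrete reproducer to hand back to the model, or to check in as a regression test.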

The Role of Autonomous Agents

A significant breakthrough occurred with the introduction of OpenAI's /goal feature. This allowed the model to work autonomously for hours at a time (sometimes up to 16), compacting its context as it went and filling in the bulk of the work. This autonomous loop handled complex features like recovery records, encryption, and multi-volume support, flood-filling roughly 40,000 lines of code.

Performance and Trade-offs

While the project succeeded functionally, its optimization results were mixed and exposed some limits of current LLMs:

Compression Efficiency: Surprisingly, the LLM-generated code achieved compression ratios within 5-10% of WinRAR, applying well-known techniques from other compressors to optimize LZSS.
Execution Speed: Performance lagged. Codex struggled to find the "novel performance hacks" that a seasoned C developer would use to optimize hot loops, resulting in code that was multiple times slower than the original.
UX and Blind Spots: The AI frequently overlooked the obvious. For example, Claude's UAT reviews missed the fact that the user experience was a "terrible mass of machine readable noise" until specifically prompted to analyze the UX.
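For readers unfamiliar with LZSS, the sketch below shows the core idea in its simplest form: replace repeated byte runs with back-references into a sliding window. It is a naive greedy matcher with assumed constants and hypothetical names, not the matcher in rars or WinRAR, and it omits the entropy coding that real RAR streams layer on top. Its quadratic inner search is also the kind of hot loop where hand-tuned implementations use hash chains and similar tricks.

```rust
/// One LZSS output symbol: either a raw byte or a back-reference.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Token {
    Literal(u8),
    /// Copy `len` bytes starting `dist` bytes back in the output.
    Match { dist: usize, len: usize },
}

// Illustrative constants; these are assumptions, not RAR's parameters.
const WINDOW: usize = 4096;
const MIN_MATCH: usize = 3;
const MAX_MATCH: usize = 258;

/// Greedy LZSS: at each position, take the longest match in the window.
pub fn compress(input: &[u8]) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let window_start = pos.saturating_sub(WINDOW);
        let mut best_len = 0;
        let mut best_dist = 0;
        // Brute-force search over every candidate start in the window.
        for start in window_start..pos {
            let mut len = 0;
            while len < MAX_MATCH
                && pos + len < input.len()
                && input[start + len] == input[pos + len]
            {
                len += 1;
            }
            if len > best_len {
                best_len = len;
                best_dist = pos - start;
            }
        }
        if best_len >= MIN_MATCH {
            tokens.push(Token::Match { dist: best_dist, len: best_len });
            pos += best_len;
        } else {
            tokens.push(Token::Literal(input[pos]));
            pos += 1;
        }
    }
    tokens
}

/// Inverse of `compress`. Assumes well-formed tokens; a `dist` larger
/// than the output produced so far would panic.
pub fn decompress(tokens: &[Token]) -> Vec<u8> {
    let mut out = Vec::new();
    for &token in tokens {
        match token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { dist, len } => {
                // Copy byte by byte so a match may overlap the bytes
                // it is itself producing (dist < len).
                let start = out.len() - dist;
                for i in 0..len {
                    let byte = out[start + i];
                    out.push(byte);
                }
            }
        }
    }
    out
}
```

Compression ratio depends mostly on how good the match choices are, while speed depends on how the matches are found. That is consistent with the split described above: borrowing known match-selection techniques is tractable, while making the search itself fast takes low-level tuning.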

Key Lessons Learned

The project serves as a case study in the future of software development with LLMs. The author's primary takeaways include:

Spec-Driven Development: Working from a detailed spec is highly effective.
Rust Proficiency: Modern models are very capable of writing Rust.
Autonomous Research: Models' ability to perform research autonomously is a powerful tool.
Context Steering: Tests, documentation, and comments act as anchors that keep the AI on track.
Architectural Control: If the developer does not provide the architecture, they will eventually pay a "refactoring fee."
Performance Limits: Do not expect novel insights or high-performance optimization from current models.

Ultimately, rars shows that the approach is feasible: a free software implementation of a proprietary format, built through a combination of human guidance and machine-generated code.
