Tilde.run: Bringing Transactional Filesystems to AI Agent Sandboxing
The deployment of autonomous AI agents into production environments has long been hindered by a fundamental fear: the "rogue agent" scenario. Whether the failure is an accidental deletion of critical data, an unauthorized network call, or a prompt-injection attack that leads to data exfiltration, giving an LLM write access to a filesystem carries significant risk.
Tilde.run aims to solve this by treating every agent run as a database-like transaction. By combining isolated compute with a versioned, composable filesystem, Tilde allows developers to let agents loose on real data while maintaining a guaranteed recovery path.
The Core Architecture: Composable and Versioned
At the heart of Tilde is a Versioned Composable Filesystem. Unlike traditional sandboxes that provide a blank slate or a simple volume mount, Tilde allows users to mount disparate data sources (GitHub repositories, AWS S3 buckets, and Google Drive documents) into a single, unified ~/sandbox directory.
This is not merely a collection of mounts; it is a versioned layer. Every file is versioned from the first commit, and any agent run can be rolled back instantly. The approach builds on lakeFS, bringing the data versioning capabilities used by large-scale data lakes to the rapid, iterative workflow of AI agents.
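To make the idea of a composed directory concrete, here is a minimal sketch of how a path under the sandbox could be routed to its backing source by longest-prefix match. Everything in it is an assumption for illustration: the `Mount` type, the `resolve` function, the mount prefixes, and the URI schemes are hypothetical and do not reflect Tilde's actual configuration or internals.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Mount:
    prefix: str  # location under the sandbox root (hypothetical)
    source: str  # placeholder URI for the backing store (hypothetical)


# Hypothetical mount table; these URIs are illustrative, not Tilde syntax.
MOUNTS = [
    Mount("repo", "github://example-org/example-repo"),
    Mount("data", "s3://example-bucket/datasets"),
    Mount("data/docs", "gdrive://example-folder"),
]


def resolve(sandbox_path, mounts=MOUNTS):
    """Map a path relative to the sandbox root to (source, path in source)."""
    path = PurePosixPath(sandbox_path)
    best = None
    for mount in mounts:
        prefix = PurePosixPath(mount.prefix)
        covers = path == prefix or prefix in path.parents
        # Prefer the most specific (deepest) mount that covers the path.
        if covers and (best is None or
                       len(prefix.parts) > len(PurePosixPath(best.prefix).parts)):
            best = mount
    if best is None:
        raise FileNotFoundError(f"no mount covers {sandbox_path!r}")
    return best.source, str(path.relative_to(best.prefix))
```

With this table, `resolve("data/docs/q3.txt")` is routed to the Drive placeholder because `data/docs` is a deeper match than `data`, while `resolve("data/sales.csv")` falls through to the S3 placeholder.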
Transactional Execution
Tilde treats each agent execution as a transaction. Each agent is launched in a fresh, isolated container, and its run is handled as follows:
- Staging: All file writes are staged. The agent interacts with a real POSIX filesystem, meaning it can use any tool or language without needing a specific SDK.
- Atomic Commit: Upon a clean exit, changes are committed atomically.
- Rollback: If the agent fails or produces undesirable results, the entire run is discarded. No manual cleanup or backup restoration is required.
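The stage, commit, and rollback cycle above can be approximated in a few lines. This is a rough sketch under stated assumptions, not Tilde's mechanism: it stages by copying the whole directory, which a versioned filesystem would avoid, and the function name `transactional_dir` is hypothetical. The commit here is two renames, each atomic on POSIX but not atomic as a pair, so it only illustrates the shape of the idea.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def transactional_dir(live_dir):
    """Yield a staging copy of live_dir; commit on clean exit, discard on error."""
    live_dir = os.path.abspath(live_dir)
    parent = os.path.dirname(live_dir)

    # Stage: the agent works on a private copy, never on the live directory.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    shutil.copytree(live_dir, staging, dirs_exist_ok=True)  # Python 3.8+

    try:
        yield staging
    except BaseException:
        # Rollback: drop the staging copy; the live directory is untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise

    # Commit: swap the staged copy into place, then remove the old version.
    # Each rename is atomic on POSIX, but the two together are not.
    backup = staging + ".old"
    os.rename(live_dir, backup)
    os.rename(staging, live_dir)
    shutil.rmtree(backup)
```

An agent run would then be wrapped as `with transactional_dir("sandbox") as workdir: ...`, with any exception inside the block leaving the original directory exactly as it was.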
Security and Governance
Beyond the filesystem, Tilde implements a multi-layered security strategy to prevent the most common failure modes of autonomous agents.
Network Isolation
To prevent data exfiltration and credential abuse, Tilde blocks cloud metadata services, private networks, and unauthorized hosts by default. Every outbound request is policy-checked and logged, allowing administrators to see exactly which API calls were allowed or denied (e.g., allowing api.openai.com while blocking evil-exfil.io).
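A default-deny egress check of this kind might look roughly like the sketch below. The allowlist contents, the `check_egress` function, and the in-memory `AUDIT_LOG` are all hypothetical stand-ins; the source does not describe how Tilde implements its policy engine. The sketch assumes the caller has already resolved the hostname and passes the resulting IP address in.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy.
ALLOWED_HOSTS = frozenset({"api.openai.com"})

# In-memory stand-in for an audit trail of every decision.
AUDIT_LOG = []


def _deny_reason(host, ip, allowed_hosts):
    """Return why a request is blocked, or None if it may proceed."""
    if ip.is_link_local:
        # 169.254.0.0/16 includes the common cloud metadata endpoint.
        return "link-local address (cloud metadata range)"
    if ip.is_private or ip.is_loopback:
        return "private or loopback network"
    if host not in allowed_hosts:
        return "host not on allowlist"
    return None


def check_egress(url, resolved_ip, allowed_hosts=ALLOWED_HOSTS):
    """Policy-check one outbound request and record the decision."""
    host = (urlsplit(url).hostname or "").lower()
    ip = ipaddress.ip_address(resolved_ip)
    reason = _deny_reason(host, ip, allowed_hosts)
    allowed = reason is None
    AUDIT_LOG.append({"host": host, "ip": str(ip), "allowed": allowed,
                      "reason": reason or "allowed"})
    return allowed
```

Checking the resolved address as well as the hostname is a deliberate choice in this sketch: it keeps an allowlisted name from being used to reach an internal address.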
Agent-First RBAC
Tilde introduces a granular Role-Based Access Control (RBAC) system where agents are treated as first-class citizens. Instead of inheriting the full permissions of the user who launched them, agents are assigned scoped permissions via a readable DSL. For example, an analyst agent might be granted READ access to CSVs and WRITE access to reports, but explicitly DENY access to secret keys, with some actions requiring human approval gates.
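The analyst example can be sketched as a small rule evaluator. This is not Tilde's DSL, whose syntax the source does not show; the rule tuples, the `APPROVE` effect, the path patterns, and the `authorize` function are assumptions made for illustration. The sketch assumes that an explicit deny always wins, that an approval gate outranks a plain allow, and that anything unmatched is denied.

```python
from fnmatch import fnmatchcase

# Hypothetical policy for an "analyst" agent: (effect, action, path pattern).
POLICY = [
    ("ALLOW", "READ", "data/*.csv"),
    ("ALLOW", "WRITE", "reports/*"),
    ("APPROVE", "WRITE", "reports/published/*"),  # needs a human sign-off
    ("DENY", "*", "secrets/*"),
]


def authorize(action, path, policy=POLICY):
    """Return "allow", "needs_approval", or "deny" for one agent action."""
    # Collect the effects of every rule whose action and pattern both match.
    effects = {
        effect
        for effect, rule_action, pattern in policy
        if rule_action in ("*", action) and fnmatchcase(path, pattern)
    }
    if "DENY" in effects:
        return "deny"            # explicit deny overrides everything
    if "APPROVE" in effects:
        return "needs_approval"  # approval gate outranks a plain allow
    if "ALLOW" in effects:
        return "allow"
    return "deny"                # default deny when no rule matches
```

Under these rules the agent can read `data/sales.csv` and write `reports/q3.md`, must wait for approval before writing under `reports/published/`, and is denied any access to `secrets/`.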
Technical Discussion and Community Feedback
The announcement of Tilde sparked debate among developers about whether such a system is necessary and how it should be implemented.
The "Standard Tools" Argument
Some critics argue that the functionality Tilde provides can be replicated using standard Linux tools. As one user noted, many of these safeguards can be engineered using Linux VMs and chattr to make folders immutable, and therefore effectively read-only. Others pointed out that S3 and Git already provide versioning, questioning the added value of a unified layer.
The State Management Challenge
A recurring point of contention is the limitation of filesystem-level versioning. While Tilde can roll back a file change, it cannot roll back an external API call or a mutation in a remote database. As one commenter observed:
"If the agent is not mutating state the change can be checked in. If it is mutating external state, version control won't save you."
Persistence vs. Ephemerality
There is a clear demand for persistent state in agentic workflows. Some developers expressed the need for agents to have a "computer" with persistent storage that remains consistent across multiple sessions, rather than just transactional runs. This highlights a tension between the desire for absolute safety (ephemeral transactions) and the desire for human-like persistence (stateful environments).
Summary of Capabilities
| Feature | Tilde Approach |
|---|---|
| Filesystem | Composable (GitHub, S3, Drive) and versioned |
| Execution | Isolated containers with atomic commits/rollbacks |
| Network | Default-deny policy with audited outbound calls |
| Permissions | Agent-specific RBAC with human-in-the-loop approval |
| Audit | Full timeline of changes tied to specific agents/humans |