OpenClaw Update: Hardening Sandboxes, Optimizing Model Discovery, and Resolving Session State Races
The latest set of merged pull requests for OpenClaw covers a wide range of fixes, from security boundaries in agent execution to performance bottlenecks in the gateway. The main focus of this window was hardening the containment of Codex-native execution and optimizing how the system handles provider authentication and session state.
Merged PRs
- fix(agents): surface blocked subagent completions Original PR
- Harden Codex native execution in OpenClaw sandboxes Original PR
- perf(plugins): thread install records through plugin load options Original PR
- fix(cli): preserve numeric config set record keys Original PR
- fix(agents): tolerate in-process session writes during prompt release Original PR
- Handle Codex turns missing completion Original PR
- fix(sessions): reduce session-store memory retention Original PR
- Fix Discord session recovery abort ownership Original PR
- Log Discord component registry error details Original PR
- fix(channels): hint at openclaw doctor --fix when bundled channel module is missing Original PR
- Preserve reusable Discord presentation buttons Original PR
- Allow Discord component registry TTL override Original PR
- Fix Ollama cloud API key discovery Original PR
- fix(cli): reject invalid node run port Original PR
- Fix/codex deactivated workspace failover Original PR
- fix(agents): classify auth HTML provider responses Original PR
- fix(gateway): allow bearer-auth session history reads Original PR
- fix(exec): protect pathPrepend against posix login-shell RC overrides Original PR
- refactor(gateway): remove unused readLastMessagePreviewFromTranscript helper Original PR
- test: fix environment sensitivity in resolveNpmCommandInvocation test Original PR
- fix(auth): load legacy Codex OAuth sidecars in embedded secrets-runtime loaders Original PR
- perf(models): /models 20s → 5ms via pre-warmed provider auth state Original PR
- fix(qa-lab): rename codex lifecycle fixtures to match knip ignore pattern Original PR
Key Changes
Security and Sandbox Hardening
This window fixes a critical security regression in which Codex-native execution bypassed the OpenClaw sandbox contract and ran in the gateway container rather than the per-agent sandbox. The system now "fails closed" by default: Codex-native execution is blocked whenever an active sandbox is present. To keep agents functional, OpenClaw now exposes the sandbox_exec and sandbox_process tools by default and introduces an explicit opt-in (appServer.experimental.sandboxExecServer) for routing native execution through a sandbox-backed exec-server integration.
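The fail-closed rule can be pictured as a small routing decision. The following is a minimal sketch, not OpenClaw's actual code: the ExecRoute and ExecContext types and the resolveCodexExecRoute function are hypothetical names, and only the opt-in flag mirrors the appServer.experimental.sandboxExecServer setting named above.

```typescript
// Hypothetical types for illustration; not OpenClaw's real API.
type ExecRoute = "native" | "sandbox-exec-server" | "blocked";

interface ExecContext {
  // True when a per-agent sandbox is active for this run.
  sandboxActive: boolean;
  // Mirrors the opt-in appServer.experimental.sandboxExecServer.
  sandboxExecServerEnabled: boolean;
}

// Fail closed: with an active sandbox, native execution is only allowed
// when it is explicitly routed through the sandbox-backed exec server.
function resolveCodexExecRoute(ctx: ExecContext): ExecRoute {
  if (!ctx.sandboxActive) return "native";
  if (ctx.sandboxExecServerEnabled) return "sandbox-exec-server";
  return "blocked";
}
```

The key design point is that the default branch is "blocked": forgetting to set the flag can never silently fall back to running in the gateway container.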
Additionally, the tools.exec.pathPrepend configuration was hardened for POSIX environments. Previously, user login-shell RC files (like .zshenv) could override the intended path priority. The fix now leverages native shell expansion of OPENCLAW_PREPEND_PATH immediately before command execution, ensuring configured tools take absolute precedence.
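One way to read this fix: the prepend directories travel in an environment variable, and the PATH assignment happens inside the command string, which a login shell evaluates only after its RC files have run. The sketch below assumes that reading. The helper names buildExecEnv and wrapCommand are hypothetical; only OPENCLAW_PREPEND_PATH comes from the change description.

```typescript
// Hypothetical helpers for illustration; not OpenClaw's real implementation.

// Pass the configured directories to the shell via an environment variable.
function buildExecEnv(
  baseEnv: Record<string, string | undefined>,
  pathPrepend: string[],
): Record<string, string | undefined> {
  return { ...baseEnv, OPENCLAW_PREPEND_PATH: pathPrepend.join(":") };
}

// Prefix the command so the shell itself expands the variable right before
// the command runs. The ${VAR:+...} form is POSIX parameter expansion and
// adds nothing when the variable is unset or empty.
function wrapCommand(command: string): string {
  const prefix =
    'PATH="${OPENCLAW_PREPEND_PATH:+$OPENCLAW_PREPEND_PATH:}$PATH"; export PATH; ';
  return prefix + command;
}
```

Because the assignment is part of the command text rather than the initial environment, a PATH reset in .zshenv or a similar RC file runs first and is then overridden.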
Gateway Performance and API Optimization
One of the most significant performance gains was achieved in the /models endpoint. By pre-warming provider authentication state at gateway startup, the response time for listing models dropped from ~20 seconds to ~5 milliseconds. This resolves a critical issue in Discord where the bot would fail to acknowledge interactions within the required 3-second window due to the event loop being blocked by auth-discovery sweeps.
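The pre-warming idea can be sketched as a snapshot that is computed once at startup and then only read on the request path. This is an illustrative sketch under assumptions: ProviderAuthCache, AuthStatus, and the discover callback are hypothetical names, and the discovery step is shown as synchronous for brevity.

```typescript
// Hypothetical names for illustration; not OpenClaw's real API.
type AuthStatus = "ready" | "missing";

class ProviderAuthCache {
  private snapshot = new Map<string, AuthStatus>();

  // Run the expensive discovery sweep once, for example at gateway startup.
  prewarm(
    providers: readonly string[],
    discover: (provider: string) => AuthStatus,
  ): void {
    const next = new Map<string, AuthStatus>();
    for (const provider of providers) {
      next.set(provider, discover(provider));
    }
    // Swap in the finished snapshot so readers never see a partial sweep.
    this.snapshot = next;
  }

  // Request path: a cheap read of the snapshot, with no discovery work.
  listReadyProviders(): string[] {
    const ready: string[] = [];
    this.snapshot.forEach((status, provider) => {
      if (status === "ready") ready.push(provider);
    });
    return ready;
  }
}
```

With this shape, the cost of listing models no longer depends on how slow auth discovery is, which is what keeps the handler inside a tight acknowledgement deadline.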
On the API side, the gateway now allows Bearer-auth session history reads, aligning /sessions/:key/history with other OpenAI-compatible endpoints and preventing unnecessary 403 errors for API key users.
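From a client's point of view, a history read with an API key might be assembled as follows. This is a sketch only: buildHistoryRequest is a hypothetical helper, the base URL is a placeholder, and only the /sessions/:key/history path and the use of Bearer auth come from the change description.

```typescript
// Hypothetical client-side helper; the base URL is a placeholder.
interface HistoryRequest {
  url: string;
  headers: Record<string, string>;
}

function buildHistoryRequest(
  baseUrl: string,
  sessionKey: string,
  apiKey: string,
): HistoryRequest {
  // Trim trailing slashes and escape the key so it stays one path segment.
  const root = baseUrl.replace(/\/+$/, "");
  const url = `${root}/sessions/${encodeURIComponent(sessionKey)}/history`;
  return { url, headers: { Authorization: `Bearer ${apiKey}` } };
}
```

The returned url and headers can be passed to any HTTP client, for example fetch(request.url, { headers: request.headers }).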
Session State and Memory Management
To combat memory leaks in high-load deployments, the session store was refactored to reduce memory retention. The system now interns repeated large prompt strings and provides immutable read APIs to avoid the costly duplication of full-store clones.
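String interning here means keeping one canonical copy of each large repeated string and pointing every session entry at it. The sketch below illustrates the technique in general terms: StringInterner and its minLength threshold are hypothetical and not taken from the OpenClaw code.

```typescript
// Hypothetical interner for illustration; not OpenClaw's real implementation.
class StringInterner {
  private readonly pool = new Map<string, string>();

  // minLength is an assumed threshold: short strings are not worth pooling.
  constructor(private readonly minLength: number = 1024) {}

  // Return the canonical copy of a large string, storing it on first sight.
  intern(value: string): string {
    if (value.length < this.minLength) return value;
    const existing = this.pool.get(value);
    if (existing !== undefined) return existing;
    this.pool.set(value, value);
    return value;
  }

  // Number of distinct large strings currently pooled.
  get size(): number {
    return this.pool.size;
  }
}
```

Callers that store the returned value instead of their own copy let the duplicate be garbage collected. A production version would also need eviction, since the pool itself keeps every interned string alive.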
Furthermore, a race condition causing EmbeddedAttemptSessionTakeoverError was resolved. The embedded runner now distinguishes between legitimate internal writes (from the same process) and external session hijacking, preventing agent runs from failing when internal session hooks update the transcript during prompt processing.
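One plausible way to tell the two cases apart is for in-process writers to record what they wrote, so that a later change check can recognize its own writes. The following sketch assumes that approach: TranscriptStamp, InProcessWriteLedger, and classifyTranscriptChange are all hypothetical names, and the actual mechanism in the PR may differ.

```typescript
// Hypothetical sketch; the real detection mechanism may differ.
interface TranscriptStamp {
  mtimeMs: number;
  size: number;
}

type ChangeKind = "unchanged" | "in-process" | "external";

function stampKey(stamp: TranscriptStamp): string {
  return `${stamp.mtimeMs}:${stamp.size}`;
}

// Writers in the same process record the file state they produced.
class InProcessWriteLedger {
  private readonly stamps = new Set<string>();

  record(stamp: TranscriptStamp): void {
    this.stamps.add(stampKey(stamp));
  }

  has(stamp: TranscriptStamp): boolean {
    return this.stamps.has(stampKey(stamp));
  }
}

// Compare the transcript state at prompt start and at prompt release.
function classifyTranscriptChange(
  atStart: TranscriptStamp,
  atRelease: TranscriptStamp,
  ledger: InProcessWriteLedger,
): ChangeKind {
  if (stampKey(atStart) === stampKey(atRelease)) return "unchanged";
  return ledger.has(atRelease) ? "in-process" : "external";
}
```

Under this scheme only the "external" result would justify raising a takeover error; an "in-process" result lets the run continue.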
Agent and Provider Reliability
Several fixes were implemented to improve the reliability of agent completions and provider failovers:
- Subagent Visibility: Blocked subagent completions (e.g., due to context overflow) are now correctly surfaced as errors to the parent agent instead of being reported as "completed successfully."
- Codex Stability: The system now handles Codex app-server turns that stop before a turn/completed event, preventing sessions from becoming stuck and ensuring the lane is released for subsequent turns.
- Auth Classification: Provider failures returning HTML 401 responses are now correctly classified as authentication failures rather than generic CDN/gateway blocks, providing users with actionable re-authentication guidance.
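The auth classification fix comes down to ordering: check the status code before treating an HTML body as a sign of a CDN or gateway block. The sketch below illustrates that ordering. FailureKind and classifyProviderFailure are hypothetical names, and the specific status codes treated as blocks are assumptions.

```typescript
// Hypothetical classifier for illustration; not OpenClaw's real logic.
type FailureKind = "auth" | "cdn_block" | "other";

function classifyProviderFailure(
  status: number,
  contentType: string | undefined,
): FailureKind {
  const isHtml = (contentType ?? "").toLowerCase().includes("text/html");
  // A 401 is an authentication failure even when the body is an HTML page.
  if (status === 401) return "auth";
  // Assumed heuristic: other HTML error pages suggest a CDN or gateway block.
  if (isHtml && (status === 403 || status >= 500)) return "cdn_block";
  return "other";
}
```

Checking the status first is what lets the caller show re-authentication guidance instead of a generic "blocked by gateway" message.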
Impact
These changes improve the stability and security of OpenClaw, particularly for users running the Codex harness with Docker sandboxing. The /models speedup (from ~20 seconds to ~5 milliseconds) removes a major friction point for Discord users and improves the overall responsiveness of the gateway.
For developers and operators, the reduced session-store memory retention and the hardened POSIX execution path lower the likelihood of out-of-memory (OOM) crashes and unpredictable tool behavior in complex environments. The refined subagent error reporting and Codex turn handling make agentic workflows more transparent and less prone to silent failures.