OpenClaw Issue Digest: Codex Runtime Failures, Session Lane Starvation, and UI Regressions
Open Issues
Recent activity in the OpenClaw repository reveals a cluster of high-severity issues affecting the Codex runtime, session management, and channel-specific delivery. Most notably, users are reporting critical failures in the OpenAI/Codex primary route and significant performance regressions on macOS.
Codex Runtime and Model Routing
Several issues highlight a fragile state for the Codex runtime. In #81213 and #81175, users report that while the Codex loader may now succeed, the primary OpenAI route frequently times out (HTTP 408), forcing a fallback to Anthropic/Pi. Furthermore, a critical packaging regression in #81175 causes MODULE_NOT_FOUND errors for the codex-native-task-runtime due to incorrect import paths in the bundled distribution.
Additionally, #81326 identifies a configuration gap where agentRuntime.id=codex can be set without the Codex plugin being installed, leading to "harness not registered" failures that are not flagged by plugins doctor.
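The gap in #81326 amounts to a missing cross-check between the configured runtime and the installed plugins. The following is a minimal sketch of such a check, written in Python for brevity. The function name, the config shape, and the assumption that the runtime id matches the name of the plugin registering its harness are all hypothetical and not taken from OpenClaw's code.

```python
def check_agent_runtime(config: dict, installed_plugins: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means the check passed.

    Hypothetical sketch: assumes the runtime id doubles as the name of the
    plugin that registers its harness.
    """
    problems = []
    runtime_id = config.get("agentRuntime", {}).get("id")
    if runtime_id is not None and runtime_id not in installed_plugins:
        problems.append(
            f"agentRuntime.id={runtime_id} is set but no plugin named "
            f"'{runtime_id}' is installed; its harness will not be registered"
        )
    return problems
```

A doctor-style command could run a check like this at startup and print each returned problem, so the misconfiguration surfaces before the first agent run instead of as a runtime failure.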
Session Lane and Execution Stability
Session "starvation" and locking have emerged as recurring themes. Issue #81335 describes a scenario where LLM provider timeouts leave session lanes locked, effectively freezing group chats until a full gateway restart. Similarly, #81375 reports that implicit cron jobs bound to live chat session keys can occupy lanes, delaying inbound user replies after a system restart.
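The lane lock-up described in #81335 is the classic shape of a lock that is not released on an error path. A minimal sketch of the general remedy follows, assuming a lane behaves like a per-session mutex. `SessionLane` and its methods are hypothetical names for illustration, not OpenClaw's actual API.

```python
import asyncio


class SessionLane:
    """Hypothetical per-session lane: one run at a time."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()

    @property
    def busy(self) -> bool:
        return self._lock.locked()

    async def run(self, provider_call, timeout_s: float):
        # 'async with' releases the lock on every exit path, including the
        # TimeoutError raised when the provider call exceeds timeout_s.
        async with self._lock:
            return await asyncio.wait_for(provider_call(), timeout=timeout_s)


async def demo() -> bool:
    lane = SessionLane()

    async def stalled_provider():
        await asyncio.sleep(10)  # stands in for a provider that never answers

    try:
        await lane.run(stalled_provider, timeout_s=0.05)
    except asyncio.TimeoutError:
        pass
    return lane.busy  # the lane should be free again after the timeout
```

Calling `asyncio.run(demo())` exercises the timeout path. The point of the design is that release is tied to scope exit rather than to a success callback, so a provider timeout cannot strand the lane.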
In the cron subsystem, #81368 and #80888 reveal a critical watchdog bug where isolated runs are killed after 60 seconds because the model_call_started event is never emitted by the Pi or CLI runners, even while the model is actively generating output.
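One possible mitigation, sketched here under stated assumptions, is for the watchdog to treat any sign of life from the runner as activity instead of waiting for a single start event. The 60-second limit comes from the reports above. The class, its method names, and the idea of resetting the timer on each output chunk are hypothetical and not confirmed by the issues.

```python
import time


class RunWatchdog:
    """Hypothetical idle watchdog for an isolated run."""

    def __init__(self, idle_limit_s: float = 60.0, clock=time.monotonic) -> None:
        self._idle_limit_s = idle_limit_s
        self._clock = clock
        self._last_activity = clock()

    def record_activity(self) -> None:
        # Called for any sign of life: a start event or a streamed output chunk.
        self._last_activity = self._clock()

    def should_kill(self) -> bool:
        # Kill only when nothing at all has been seen for longer than the limit.
        return self._clock() - self._last_activity > self._idle_limit_s
```

With this shape, a runner that never emits a start event but keeps streaming output continues to reset the timer, while a run that is truly stalled is still terminated once the idle limit passes.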
Channel and UI Regressions
Channel reliability is currently inconsistent across several platforms:
- WhatsApp: #81369 reports a regression where raw Codex tool/XML scaffolding (e.g., <tool_calls>) leaks into visible chat channels during heartbeat delivery. #81358 and #81322 report that reply text and images are intermittently dropped when responses include tool calls.
- Feishu: #78274 and #76104 describe routing failures where replies are sent to the webchat dashboard instead of the origin Feishu channel, particularly in group chats.
- WebChat/TUI: #81339 reports that leading whitespace in code blocks is stripped, breaking ASCII diagrams. #67035 describes a severe Windows UI regression where input text is swallowed and streamed replies are invisible until a page refresh.
Performance and System-Level Bugs
On macOS, #73743 reports a massive CLI startup regression, with openclaw recipes list taking ~25s on idle systems and stretching to 5-6 minutes under load, causing downstream timeouts in ClawKitchen. Additionally, #66977 highlights a platform-specific failure where sqlite-vec cannot load on macOS because node:sqlite is compiled with OMIT_LOAD_EXTENSION.
Key Themes
1. The "Silent Failure" Pattern
Across multiple subsystems, OpenClaw is failing silently or reporting success for failed operations. Examples include:
- Subagents: #81345 and #79285 report subagents that produce zero tokens or are terminated mid-task but are still marked as completed successfully.
- Cron Delivery: #79753 reports cron announcements that mark delivered: true despite the message never arriving on WeChat or Feishu.
- Write Tool: #67136 describes the write tool reporting successful byte writes while no file is actually created.
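The common thread in these reports is that success is assumed unless an error is raised. A minimal sketch of the opposite policy, applied to the subagent case, is shown below. The `SubagentResult` fields and the status strings are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubagentResult:
    """Hypothetical shape of what a subagent run reports back."""
    output_tokens: int
    terminated_early: bool


def final_status(result: SubagentResult) -> str:
    # Report success only when there is positive evidence of it.
    if result.terminated_early:
        return "terminated"
    if result.output_tokens == 0:
        return "failed_empty_output"
    return "completed"
```

The same idea would apply to the other two cases: confirm a delivery acknowledgement before setting the delivered flag, and confirm the file exists on disk before reporting bytes written.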
2. Context and Memory Contamination
Context is being both lost and wrongly carried over between turns. #81286 reports a high-severity bug where conversation history is not passed between runs, leading to "conversation amnesia." In multimodal contexts, #66702 reports that stale images from previous turns are reintroduced into the active vision prompt, contaminating current analysis.
3. Sandbox and Environment Isolation
Environment overrides are causing operational friction. #78202 notes that the Codex harness overrides HOME, breaking scripts that rely on ~ expansion. #66612 reports that the memory flush sandbox blocks writes to SESSION_HANDOFF.md, contradicting the prompt's own instructions.
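The mechanism behind #78202 is easy to reproduce in isolation: on POSIX systems, tilde expansion reads the HOME environment variable, so any process that overrides HOME changes where ~ points. The sketch below demonstrates this with Python's standard library. The helper name is hypothetical, and the value the Codex harness actually assigns to HOME is not stated in the issue.

```python
import os


def expand_with_home(path: str, home: str) -> str:
    """Expand '~' as a process whose HOME is `home` would (POSIX behavior)."""
    saved = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        return os.path.expanduser(path)
    finally:
        # Restore the caller's environment whether or not expansion succeeded.
        if saved is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = saved
```

A script that reads a file under ~ would therefore look under the overridden directory, not the user's real home directory, which matches the breakage described in the report.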
Action Required
Critical Priority (Immediate Attention)
- Codex Runtime Fixes: Resolve the MODULE_NOT_FOUND packaging error (#81175) and the primary route timeouts (#81213) to restore OpenAI/Codex functionality.
- Session Lane Recovery: Implement a mechanism to release session lanes after provider timeouts (#81335) and fix the cron watchdog's failure to recognize active model calls (#81368).
- UI/UX Blockers: Address the Windows chat UI regression (#67035) and the macOS CLI startup latency (#73743).
High Priority (Stability & Reliability)
- History Persistence: Fix the messagesSnapshot injection failure (#81286) to restore multi-turn conversation capabilities.
- Channel Delivery: Fix the silent dropping of WhatsApp media and text (#81358, #81322) and the Feishu routing errors (#78274).
- Subagent Integrity: Implement validation to ensure subagents that produce zero tokens are not marked as successful (#81345).
Medium Priority (Maintenance)
- macOS Compatibility: Address the sqlite-vec extension loading issue (#66977) to enable vector search on macOS.
- Audit Noise: Implement the capability manifest field (#80573) to reduce false-positive exfiltration warnings in security audits.