OpenClaw Issue Digest: Critical Event Loop Bottlenecks and Multi-Agent Instability
Open Issues
The recent activity window reveals a system under significant architectural strain, particularly regarding the Node.js event loop and the stability of multi-agent orchestration. While several quality-of-life feature requests for the Control UI and channel adapters (Telegram, Discord, Feishu) have been submitted, the core focus remains on resolving critical performance bottlenecks and data-loss risks.
Critical Performance and Stability Issues
Several reports indicate that the OpenClaw Gateway is suffering from severe event loop blocking. One critical report describes WebSocket response times exceeding 100 seconds and agent dispatch taking up to 26 seconds before a model is even called. The reported root cause is the single-threaded architecture: heavy synchronous operations, such as macOS keychain access via execSync and various readFileSync calls, monopolize the main thread and freeze the gateway for all users.
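As a rough illustration of the non-blocking alternative, the sketch below replaces synchronous calls with their promise-based counterparts and caches results so repeated lookups do not repeat the I/O. The helper names (createAsyncCache, readTextCached, runCommandCached) and the TTL values are hypothetical and not taken from the OpenClaw codebase; only execFile, promisify, and fs/promises readFile are real Node.js APIs.

```typescript
import { execFile } from "node:child_process";
import { readFile } from "node:fs/promises";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: caches the promise returned by an async loader per key.
// Concurrent callers for the same key share one in-flight promise, so the
// underlying I/O runs once per TTL window instead of once per request.
function createAsyncCache<T>(ttlMs: number) {
  const entries = new Map<string, { value: Promise<T>; expiresAt: number }>();
  return function get(key: string, load: () => Promise<T>): Promise<T> {
    const now = Date.now();
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value;
    const value = load().catch((err) => {
      // Do not cache failures; the next caller retries.
      entries.delete(key);
      throw err;
    });
    entries.set(key, { value, expiresAt: now + ttlMs });
    return value;
  };
}

// TTLs are illustrative assumptions.
const fileCache = createAsyncCache<string>(5_000);
const commandCache = createAsyncCache<string>(60_000);

// Non-blocking stand-in for readFileSync(path, "utf8").
function readTextCached(path: string): Promise<string> {
  return fileCache(path, () => readFile(path, "utf8"));
}

// Non-blocking stand-in for execSync(...): the child process runs while the
// event loop keeps serving other connections.
function runCommandCached(file: string, args: string[]): Promise<string> {
  const key = JSON.stringify([file, args]);
  return commandCache(key, async () => {
    const { stdout } = await execFileAsync(file, args, {
      encoding: "utf8",
      timeout: 10_000,
    });
    return stdout.trim();
  });
}
```

Whether caching is appropriate for credential lookups depends on how often the stored secret changes. The asynchronous call alone already removes the main-thread stall.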
Similarly, issues with the openai-codex harness have surfaced, where a regression in version 2026.5.12 causes a "harness not registered" error, silently forcing agents onto fallback models and degrading performance and cost predictability.
Multi-Agent and Session Routing Failures
Orchestration instability is a recurring theme. Users report that concurrent "agents add" commands race on the global config file, so one command's write overwrites another's and agent configurations are lost. Furthermore, isolated agent runs are tripping session lock timeouts, and some runs appear to fail at the CLI layer while continuing to mutate worktrees in the background as "detached" processes.
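One common way to avoid this kind of lost update is to serialize the read-modify-write cycle behind a lock file and to write the result atomically. The sketch below assumes a JSON config with an "agents" map, which is a guess at the file's shape, and uses hypothetical names (withFileLock, addAgent). It relies on the real "wx" open flag, which fails with EEXIST when the file already exists.

```typescript
import { open, readFile, rename, unlink, writeFile } from "node:fs/promises";

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Acquire an advisory lock by creating lockPath exclusively, run fn, then
// release the lock. Works across processes because file creation is atomic.
async function withFileLock<T>(
  lockPath: string,
  fn: () => Promise<T>,
  retries = 100,
  delayMs = 25,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      const handle = await open(lockPath, "wx");
      await handle.close();
      break;
    } catch (err) {
      const code = (err as NodeJS.ErrnoException).code;
      if (code !== "EEXIST" || attempt >= retries) throw err;
      await sleep(delayMs);
    }
  }
  try {
    return await fn();
  } finally {
    await unlink(lockPath);
  }
}

// Assumed config shape, for illustration only.
type Config = { agents: Record<string, unknown> };

// Read-modify-write under the lock. The new content goes to a temp file and
// is renamed into place so readers never observe a half-written config.
async function addAgent(
  configPath: string,
  name: string,
  agent: unknown,
): Promise<void> {
  await withFileLock(`${configPath}.lock`, async () => {
    let config: Config = { agents: {} };
    try {
      config = JSON.parse(await readFile(configPath, "utf8")) as Config;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
    }
    config.agents[name] = agent;
    const tmpPath = `${configPath}.${process.pid}.tmp`;
    await writeFile(tmpPath, JSON.stringify(config, null, 2));
    await rename(tmpPath, configPath);
  });
}
```

This sketch does not handle stale locks left behind by a crashed process. A production version would need a lock timeout or an owner check.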
Routing errors are also prevalent in channel-specific implementations:
- Feishu: Reports indicate that all messages are being routed to a single agent regardless of bindings, and group chat replies are erroneously dispatching to the main webchat session.
- Discord: A critical preflight bug allows messages to pass through mention-gating when a mention appears in quoted text rather than as an active target.
- Telegram: Forum-topic voice notes are reaching agents as raw audio, bypassing the transcription and echo paths that work in direct messages.
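For the Discord case, a minimal sketch of stricter mention gating is shown below. It assumes that "quoted text" means Discord markdown block quotes and code spans, which the report does not confirm, and the function names are hypothetical. The mention syntax <@ID> and <@!ID> is Discord's standard user-mention format; botUserId is expected to be a numeric snowflake string.

```typescript
// Remove content that should not count as an active mention: fenced code,
// inline code, and block quotes ("> " lines and ">>> " blocks).
function stripQuotedContent(content: string): string {
  const withoutCode = content
    .replace(/```[\s\S]*?```/g, " ")
    .replace(/`[^`\n]*`/g, " ");
  const kept: string[] = [];
  for (const line of withoutCode.split("\n")) {
    if (/^>>> /.test(line)) break; // ">>> " quotes the rest of the message
    if (/^> /.test(line)) continue; // "> " quotes a single line
    kept.push(line);
  }
  return kept.join("\n");
}

// True only when the bot is mentioned outside quoted or code content.
function isActiveMention(content: string, botUserId: string): boolean {
  const mention = new RegExp(`<@!?${botUserId}>`);
  return mention.test(stripQuotedContent(content));
}
```

If the bug instead involves the text of a replied-to message, the same idea applies: run the check against the author's own content only.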
Memory and Context Management
Memory management remains a point of contention, with reports of "chaos" in how different users experience chunking and embedding. Specific technical failures include silent truncation of MEMORY.md at 20K characters and a bug where max_tokens is set without subtracting the input tokens already used, so input plus requested output can exceed the context window and the API returns "too large" errors on models with large context windows.
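The two failures can be sketched as small pure functions. The names, the safety margin, and the return shapes below are illustrative assumptions rather than OpenClaw's actual API. The point is that the output budget is whatever remains of the context window after the input, and that truncation should be reported rather than silent.

```typescript
// Output budget = context window - input tokens - safety margin, capped at
// the requested maximum. The default margin is an arbitrary assumption.
function resolveMaxTokens(
  contextWindow: number,
  inputTokens: number,
  requestedMax: number,
  safetyMargin = 256,
): number {
  const remaining = contextWindow - inputTokens - safetyMargin;
  if (remaining <= 0) {
    throw new RangeError(
      `input of ${inputTokens} tokens leaves no room in a ${contextWindow}-token window`,
    );
  }
  return Math.min(requestedMax, remaining);
}

// Truncate to a character limit and report what was dropped, so the caller
// can log a warning or surface it to the user.
function truncateWithNotice(
  text: string,
  limit: number,
): { text: string; truncated: boolean; droppedChars: number } {
  if (text.length <= limit) {
    return { text, truncated: false, droppedChars: 0 };
  }
  return {
    text: text.slice(0, limit),
    truncated: true,
    droppedChars: text.length - limit,
  };
}
```

For example, with a 200,000-token window, 150,000 input tokens, and a requested maximum of 64,000, the budget becomes 200,000 - 150,000 - 256 = 49,744 tokens instead of an oversized request.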
Key Themes
1. Architectural Bottlenecks
There is a clear push toward moving away from synchronous I/O and single-threaded dispatch. The current model of rebuilding system prompts and resolving model configurations on every turn is creating unacceptable latency.
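One way to avoid per-turn rebuilds is to cache the built prompt per agent and rebuild only when its inputs change. The sketch below is a minimal illustration under assumed input fields (agentId, skills, memory) and a hypothetical createPromptCache helper. It keys the cache on a SHA-256 fingerprint of the inputs.

```typescript
import { createHash } from "node:crypto";

// Assumed inputs to a system prompt, for illustration only.
type PromptInputs = { agentId: string; skills: string[]; memory: string };

// Stable fingerprint of everything the prompt depends on.
function fingerprint(inputs: PromptInputs): string {
  const payload = JSON.stringify([inputs.agentId, inputs.skills, inputs.memory]);
  return createHash("sha256").update(payload).digest("hex");
}

// Wraps an expensive build function. Keeps one entry per agent and rebuilds
// only when the fingerprint of the inputs differs from the cached one.
function createPromptCache(build: (inputs: PromptInputs) => string) {
  const byAgent = new Map<string, { key: string; prompt: string }>();
  return (inputs: PromptInputs): string => {
    const key = fingerprint(inputs);
    const hit = byAgent.get(inputs.agentId);
    if (hit && hit.key === key) return hit.prompt;
    const prompt = build(inputs);
    byAgent.set(inputs.agentId, { key, prompt });
    return prompt;
  };
}
```

Keeping a single entry per agent bounds memory use. The same pattern could apply to resolved model configurations.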
2. Security and Data Integrity
Security concerns are rising, with requests for automatic output sanitization to prevent API key leakage and a critical report regarding the gh-issues skill, which injects untrusted issue bodies directly into sub-agent prompts, creating a prompt injection vector for agents with shell access.
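A minimal sketch of output sanitization follows. It redacts exact secret values known to the process and then applies a few pattern-based rules. The patterns and the redactSecrets name are illustrative assumptions; a real deployment would need a maintained pattern list, and pattern matching alone cannot catch every key format.

```typescript
// Illustrative patterns only; key formats vary by provider.
const SECRET_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{20,}/g, // "sk-" prefixed API keys (assumed shape)
  /\bgh[pousr]_[A-Za-z0-9]{36,}/g, // GitHub-style tokens
];

const REDACTED = "[REDACTED]";

// Redact known secret values first (most reliable), then pattern matches.
// Short literals are skipped to avoid redacting ordinary words.
function redactSecrets(text: string, knownSecrets: string[] = []): string {
  let out = text;
  for (const secret of knownSecrets) {
    if (secret.length >= 8) out = out.split(secret).join(REDACTED);
  }
  for (const pattern of SECRET_PATTERNS) {
    out = out.replace(pattern, REDACTED);
  }
  return out;
}
```

Redaction addresses key leakage only. The gh-issues prompt injection vector is a separate problem that calls for isolating untrusted issue bodies from agents with shell access.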
3. Tooling and UX Refinement
There is a strong demand for better developer and operator tooling, including a stable Plugin SDK to prevent "correctness drift" in third-party skills and a more robust openclaw doctor that provides actionable repair hints rather than raw schema errors.
Action Required
High Severity / Blockers
- Event Loop Optimization: Immediate refactoring of execSync and readFileSync in the critical path is required to prevent gateway freezes.
- Codex Harness Registration: Fix the harness name mismatch in 2026.5.12 to restore openai-codex functionality.
- Discord Mention Gating: Tighten the attention window for mention detection to prevent incorrect routing.
- Prompt Injection in gh-issues: Implement sanitization or isolation for untrusted issue bodies to prevent arbitrary command execution.
Blocked or High-Attention Issues
- Multi-Agent Config Races: Implement file-locking or serialization for agents add to prevent config overwrites.
- Feishu Routing: Resolve the regression where multiple agents are collapsed into a single session.
- Subagent Completion Loss: Address the silent loss of subagent results when announce-back fails, which currently leaves parent sessions in a permanent "waiting" state.
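For the subagent case, one possible safeguard is to bound the wait so the parent always reaches a terminal state. The sketch below is an assumption about how such a guard might look. The SubagentOutcome type and awaitSubagent function are hypothetical and do not describe OpenClaw's existing announce-back mechanism.

```typescript
type SubagentOutcome<T> =
  | { status: "completed"; result: T }
  | { status: "timed_out" }
  | { status: "failed"; error: unknown };

// Wait for a subagent's completion promise, but never longer than timeoutMs.
// The parent always gets an explicit outcome instead of waiting indefinitely.
async function awaitSubagent<T>(
  completion: Promise<T>,
  timeoutMs: number,
): Promise<SubagentOutcome<T>> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<SubagentOutcome<T>>((resolve) => {
    timer = setTimeout(() => resolve({ status: "timed_out" }), timeoutMs);
  });
  try {
    return await Promise.race([
      completion.then(
        (result): SubagentOutcome<T> => ({ status: "completed", result }),
      ),
      timeout,
    ]);
  } catch (error) {
    return { status: "failed", error };
  } finally {
    clearTimeout(timer);
  }
}
```

A timeout only prevents the permanent "waiting" state. Recovering the lost result would additionally require the subagent to persist its output somewhere the parent can re-read.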