OpenClaw Issue Digest: Connectivity Bottlenecks, Auth Regressions, and Runtime Stability
Open Issues
Recent activity in the OpenClaw repository reveals a series of critical architectural bottlenecks and regressions affecting system stability, particularly concerning the Node.js event loop and provider authentication.
Critical Performance & Architectural Bottlenecks
One of the most severe reports describes a single-threaded event loop bottleneck where the Gateway becomes unresponsive during agent preparation. Tasks spend 14-26 seconds in model resolution and prompt building before a single API call is made, causing WebSocket response times to spike to 100+ seconds. This is compounded by a reported massive virtual memory bloat (22GB+ VIRT) immediately after startup, which, while not always impacting RSS, suggests underlying issues with the V8 ArrayBufferAllocator or native module loading.
Authentication & Provider Regressions
Several high-severity issues have emerged regarding OAuth and API connectivity:
- OpenAI Codex OAuth Failures: Reports indicate that token refresh can fail with refresh_token_reused errors, and the system may stick to a stale lastGood profile even when fresh profiles are available. Additionally, some users report "incomplete terminal responses" due to gzipped binary data not being decoded by the Gateway's HTTP client.
- Anthropic/Claude CLI Issues: On Windows, the claude-cli backend suffers from spawn ENOENT and EINVAL errors due to how Node.js handles .cmd and .ps1 shims without shell: true.
- Cloudflare Blocking: Users in mainland China report that the openai-codex provider is blocked by Cloudflare JS Challenges because the Node.js native fetch TLS fingerprint is detected as non-browser traffic.
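The Windows shim problem can be illustrated with a small helper that decides how a command should be spawned. This is a minimal sketch, not the project's code: planSpawn and SpawnPlan are hypothetical names, and the PowerShell arguments are an assumption about how a .ps1 shim would be launched. It relies on the fact that recent Node.js releases reject .cmd and .bat targets with EINVAL unless shell: true is set.

```typescript
import type { SpawnOptions } from "node:child_process";
import * as path from "node:path";

// Hypothetical shape describing how a command should be handed to spawn().
interface SpawnPlan {
  command: string;
  args: string[];
  options: SpawnOptions;
}

// Decide how to launch a CLI shim. The platform parameter defaults to the
// current platform and exists mainly so the logic can be exercised anywhere.
function planSpawn(
  command: string,
  args: string[],
  platform: string = process.platform,
): SpawnPlan {
  if (platform !== "win32") {
    return { command, args, options: { shell: false } };
  }
  const ext = path.win32.extname(command).toLowerCase();
  if (ext === ".cmd" || ext === ".bat") {
    // Batch shims have to run through cmd.exe. With shell: true the
    // arguments are not escaped by Node.js, so they must be trusted.
    return { command, args, options: { shell: true } };
  }
  if (ext === ".ps1") {
    // PowerShell scripts are not executables; run them via powershell.exe.
    return {
      command: "powershell.exe",
      args: ["-NoProfile", "-File", command, ...args],
      options: { shell: false },
    };
  }
  return { command, args, options: { shell: false } };
}
```

A caller would pass the result straight through, for example spawn(plan.command, plan.args, plan.options). Resolving the shim's full path first (to avoid ENOENT on bare names) is left out of the sketch.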
Channel & Delivery Failures
Delivery regressions are appearing across multiple integrations:
- Discord: Users report a "no-mention" false positive where explicit mentions in threads are ignored after a socket-mode reconnect. Additionally, native approval cards for exec commands are failing to surface.
- Telegram: A regression in v2026.5.2 causes a "fetch-timeout storm" on networks without IPv6 egress, where periodic getMe calls fail every 60 seconds, saturating the event loop.
- WhatsApp: Image sending is currently broken in v2026.5.7; images are processed and resized but not attached to outbound messages (hasMedia: false).
- Feishu: Sub-agent auto-replies in group chats are failing to route back, resulting in replies=0 in logs.
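For the Telegram case, one way to avoid repeated timeouts on networks without IPv6 egress is to remember that dual-stack attempts have failed and stay on IPv4 afterwards. The sketch below is an assumption about what a sticky fallback could look like, not the dispatcher the project shipped: StickyIpv4Selector, the error-code list, and the threshold are all hypothetical. The only Node.js behaviour it relies on is the family option accepted by http.request and https.request.

```typescript
// Error codes treated as "the network path is unusable" (an assumed list).
const NETWORK_ERROR_CODES = new Set([
  "ETIMEDOUT",
  "ENETUNREACH",
  "EHOSTUNREACH",
  "EADDRNOTAVAIL",
]);

// Hypothetical helper: after enough consecutive network failures it pins
// requests to IPv4 and never switches back for the life of the process.
class StickyIpv4Selector {
  private pinnedToIpv4 = false;
  private consecutiveFailures = 0;
  private readonly threshold: number;

  constructor(threshold = 2) {
    this.threshold = threshold;
  }

  // Options to merge into an https.request() call.
  requestOptions(): { family?: 4 } {
    return this.pinnedToIpv4 ? { family: 4 } : {};
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
  }

  recordFailure(code: string | undefined): void {
    if (this.pinnedToIpv4 || code === undefined) return;
    if (!NETWORK_ERROR_CODES.has(code)) return;
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.threshold) {
      this.pinnedToIpv4 = true; // sticky: stays IPv4-only from here on
    }
  }
}
```

A polling loop would spread selector.requestOptions() into each request and report the outcome back, so that a periodic health check stops retrying a path that has already failed.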
Runtime & Sub-agent Orchestration
Sub-agent stability is a recurring theme. Issues include sub-agent announce-back timeouts (10s WS timeouts) and a bug where the subagent-announce flow lacks a SILENT_REPLY_TOKEN guard, leading to duplicate messages when a parent agent has already delivered results. Furthermore, a critical bug in the acpx runtime causes sessions_spawn to fail for non-Codex ACP agents because the runtime forwards an unsupported timeout config option, triggering an ACP_TURN_FAILED error.
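The acpx failure described above comes down to forwarding a config option the target agent does not support. A defensive approach is to forward only the options an agent is known to accept. The following is a sketch under assumptions: ConfigOption, ForwardPlan, and planConfigForwarding are hypothetical names, and how an agent advertises its supported option IDs is not described in the reports.

```typescript
// Hypothetical shape of a config option the runtime wants to set.
interface ConfigOption {
  configId: string;
  value: unknown;
}

interface ForwardPlan {
  forward: ConfigOption[]; // safe to send to the agent
  skipped: string[]; // option IDs withheld because the agent lacks support
}

// Split requested options into those the agent supports and those it does
// not, so that an unsupported option (such as "timeout") is never sent.
function planConfigForwarding(
  requested: ConfigOption[],
  supportedIds: ReadonlySet<string>,
): ForwardPlan {
  const plan: ForwardPlan = { forward: [], skipped: [] };
  for (const option of requested) {
    if (supportedIds.has(option.configId)) {
      plan.forward.push(option);
    } else {
      plan.skipped.push(option.configId);
    }
  }
  return plan;
}
```

Logging the skipped IDs rather than dropping them silently keeps the behaviour visible, which matters given how many of the reported bugs involve results that disappear without a trace.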
Key Themes
1. Event Loop Saturation
There is a systemic pattern of the main Node.js thread being blocked by synchronous preparation work. This manifests as high Event Loop Utilization (ELU), causing timeouts in WebSocket handshakes, fetch operations, and sub-agent announcements. The consensus among reports is that agent preparation (prompt building, plugin loading) must be offloaded to Worker Threads.
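The Worker Threads suggestion can be sketched as follows. This is an illustration only: prepareInWorker is a hypothetical name, and the string-joining inside the worker is a stand-in for the real preparation work (prompt building, plugin loading), which the reports do not describe in detail. The worker body is inlined as plain JavaScript and started with eval: true so the example fits in one file.

```typescript
import { Worker } from "node:worker_threads";

// Worker body as plain JavaScript. It runs on its own thread, so CPU-heavy
// work here does not block the main event loop.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  // Stand-in for expensive preparation work.
  const prompt = workerData.sections.map((s) => s.trim()).join("\\n\\n");
  parentPort.postMessage({ prompt });
`;

// Hypothetical helper: run preparation off the main thread and resolve
// with the result once the worker posts it back.
function prepareInWorker(sections: string[]): Promise<{ prompt: string }> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { sections },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}
```

While the promise is pending, the main thread remains free to answer WebSocket pings and fetch callbacks. A production version would likely reuse a pool of workers rather than starting one per request, since worker startup has its own cost.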
2. The "Silent Failure" Pattern
Many reported bugs follow a pattern of "silent drops"—where a process completes successfully in the logs (e.g., stopReason: stop), but the result never reaches the user. This is seen in Discord approval cards, Feishu group replies, and WhatsApp media delivery.
3. Configuration & UX Friction
Users are reporting "silent deprecations" where keys are removed from the schema without warning, causing CLI commands to exit with "Config invalid." There is also a strong request for a doctor --dry-run mode to preview config repairs before they are applied.
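A dry-run mode of the kind requested could separate planning repairs from applying them, and turn removed keys into warnings rather than hard failures. The sketch below is hypothetical throughout: the deprecated key names are invented for illustration, and planRepairs, applyRepairs, and doctor are not the project's actual functions.

```typescript
type Config = Record<string, unknown>;

interface PlannedRepair {
  key: string;
  action: "rename" | "remove";
  replacement?: string;
  warning: string;
}

// Invented deprecation table; real key names are not given in the reports.
const DEPRECATED_KEYS = new Map<string, { replacement?: string }>([
  ["legacyTimeout", { replacement: "timeoutMs" }],
  ["experimentalFlag", {}],
]);

// Compute what would change, without touching the config.
function planRepairs(config: Config): PlannedRepair[] {
  const plan: PlannedRepair[] = [];
  for (const key of Object.keys(config)) {
    const rule = DEPRECATED_KEYS.get(key);
    if (rule === undefined) continue;
    if (rule.replacement !== undefined) {
      plan.push({
        key,
        action: "rename",
        replacement: rule.replacement,
        warning: `"${key}" is deprecated; use "${rule.replacement}"`,
      });
    } else {
      plan.push({ key, action: "remove", warning: `"${key}" is deprecated and ignored` });
    }
  }
  return plan;
}

// Apply a plan to a copy of the config; the input object is not mutated.
function applyRepairs(config: Config, plan: PlannedRepair[]): Config {
  const next: Config = { ...config };
  for (const repair of plan) {
    const value = next[repair.key];
    delete next[repair.key];
    if (repair.replacement !== undefined && !(repair.replacement in next)) {
      next[repair.replacement] = value;
    }
  }
  return next;
}

// With dryRun set, the plan is returned but the original config is kept.
function doctor(config: Config, options: { dryRun: boolean }) {
  const plan = planRepairs(config);
  return { plan, config: options.dryRun ? config : applyRepairs(config, plan) };
}
```

Calling doctor(config, { dryRun: true }) lets a CLI print each warning and the intended change, then exit without writing anything.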
Action Required
High Severity / Blockers
- Fix acpx Runtime Forwarding: Resolve the set_config_option {configId: "timeout"} bug that blocks all non-Codex ACP agent spawns.
- Address Event Loop Bottlenecks: Implement async agent preparation and offload channel polling to Worker Threads to prevent Gateway freezes.
- Repair WhatsApp Media Path: Investigate why hasMedia is false for outbound WhatsApp messages despite successful image processing.
- Fix Anthropic Gzip Decoding: Ensure the Gateway HTTP client decompresses responses before classification to stop the "incomplete terminal response" errors.
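The gzip item amounts to decompressing the response body before any code inspects it. A minimal sketch, assuming the body is available as a Buffer and that the payload is text: decodeBody is a hypothetical name, and detecting gzip by its two magic bytes (0x1f 0x8b) is a choice made here so that bodies already decoded by the HTTP client pass through unchanged.

```typescript
import { gunzipSync, gzipSync } from "node:zlib";

// Return the body as UTF-8 text, decompressing first when the buffer
// starts with the gzip magic bytes.
function decodeBody(body: Buffer): string {
  const looksGzipped = body.length >= 2 && body[0] === 0x1f && body[1] === 0x8b;
  const raw = looksGzipped ? gunzipSync(body) : body;
  return raw.toString("utf8");
}

// Usage: a compressed JSON payload decodes back to the original text.
const decodedSample = decodeBody(gzipSync('{"ok":true}'));
console.log(decodedSample);
```

Checking the bytes rather than trusting the Content-Encoding header alone covers both a missing header and a client that has already decompressed the stream. Streaming responses would need zlib.createGunzip instead of the synchronous call.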
Blocked / Needs Attention
- Discord Socket-Mode Reconnect: Fix the no-mention logic that drops explicit mentions after a reconnect.
- Telegram IPv6 Fallback: Restore the "sticky IPv4-only dispatcher" to prevent the getMe timeout storm on IPv4-only networks.
- Sub-agent Announce Guards: Add the SILENT_REPLY_TOKEN to the expectsCompletionMessage branch to stop duplicate replies.
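The announce guard in the last item might look like the sketch below. It rests on assumptions: the token's real value is not given in the reports, so a placeholder is used, and the idea that a parent agent emits the token when it has already delivered results is inferred from the bug description. AnnounceInput and shouldDeliverAnnounce are hypothetical names.

```typescript
// Placeholder value; the actual token string is not stated in the reports.
const SILENT_REPLY_TOKEN = "<silent-reply-placeholder>";

interface AnnounceInput {
  expectsCompletionMessage: boolean;
  replyText: string;
}

// Decide whether an announce-back reply should reach the user. The silent
// check runs before any branching, so the expectsCompletionMessage path
// cannot bypass it.
function shouldDeliverAnnounce(input: AnnounceInput): boolean {
  const text = input.replyText.trim();
  if (text === SILENT_REPLY_TOKEN) {
    return false; // parent signalled that nothing more should be sent
  }
  if (input.expectsCompletionMessage) {
    return text.length > 0; // a completion message needs actual content
  }
  return text.length > 0;
}
```

Placing the token check ahead of the branch, rather than inside each one, makes it harder for a new code path to reintroduce the duplicate.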