OpenClaw Issue Digest: Session Isolation, Provider Stability, and Runtime Regressions

00:30–06:30 UTC May 22, 2026

Open Issues

Activity in the OpenClaw repository during this window is dominated by high-severity issues in three areas: session isolation, provider-specific delivery failures, and regressions in the gateway's update and recovery mechanisms.

Session Isolation and Stability

Critical failures in session isolation have been reported. The most notable is #84903, where a single stalled agent session can block the entire Gateway event loop, leading to 100% CPU utilization and silent message drops for all other active sessions. Similarly, #85250 describes a bug where sessions_yield leaves parent sessions unwakeable by subagent completion events, forcing users to send a manual message so that the result can "piggyback" on it.
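
To make the isolation idea concrete, here is a minimal sketch of a cooperative round-robin scheduler with a per-session step budget. It is not OpenClaw's implementation: the generator-based session model, runWithBudgets, and the stepBudget parameter are assumptions made purely for illustration.

```typescript
// Hypothetical model: a session is a generator that yields between units of work
// and returns its reply when finished.
type SessionTask = Generator<void, string, void>;

interface SchedulerResult {
  completed: Map<string, string>; // sessionId -> reply
  evicted: string[];              // sessions that exhausted their budget
}

// Round-robin over all sessions, allowing each at most `stepBudget` steps.
function runWithBudgets(
  sessions: Map<string, SessionTask>,
  stepBudget: number,
): SchedulerResult {
  const completed = new Map<string, string>();
  const evicted: string[] = [];
  const used = new Map<string, number>();
  const active = new Map(sessions);

  while (active.size > 0) {
    for (const [id, task] of Array.from(active)) {
      const steps = (used.get(id) ?? 0) + 1;
      used.set(id, steps);
      if (steps > stepBudget) {
        // Budget exhausted: drop this session instead of letting it starve others.
        active.delete(id);
        evicted.push(id);
        continue;
      }
      const next = task.next();
      if (next.done) {
        completed.set(id, next.value);
        active.delete(id);
      }
    }
  }
  return { completed, evicted };
}

// A session that never finishes (stands in for the stalled session).
function* stalled(): SessionTask {
  while (true) yield;
}

// A session that finishes after a few steps.
function* quick(reply: string, steps: number): SessionTask {
  for (let i = 0; i < steps; i++) yield;
  return reply;
}

const outcome = runWithBudgets(
  new Map<string, SessionTask>([
    ["stuck", stalled()],
    ["alice", quick("hello", 3)],
  ]),
  10,
);
```

A budget like this only helps when sessions yield control regularly. Work that blocks the event loop synchronously cannot be interrupted this way and would need to run outside the main thread, for example in a worker.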

In the memory subsystem, a P0 security vulnerability (#85240) was identified: the relevant-memories recall mechanism lacks sender_id isolation, so private memories from one user can leak into another user's conversation context in multi-user deployments.
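
The kind of filtering this calls for can be sketched as follows. The MemoryRecord shape, the score field, and recallForSender are hypothetical names, not OpenClaw's actual memory API. The point is only that the sender filter runs before any relevance ranking.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface MemoryRecord {
  senderId: string; // the user the memory was captured from
  text: string;
  score: number;    // relevance score for the current query
}

// Return the top memories for one sender, never crossing sender boundaries.
function recallForSender(
  memories: MemoryRecord[],
  senderId: string,
  limit = 3,
): MemoryRecord[] {
  return memories
    .filter((m) => m.senderId === senderId) // isolation first
    .sort((a, b) => b.score - a.score)      // then rank by relevance
    .slice(0, limit);
}

const store: MemoryRecord[] = [
  { senderId: "alice", text: "Alice's home address", score: 0.99 },
  { senderId: "bob", text: "Bob prefers short replies", score: 0.4 },
  { senderId: "bob", text: "Bob's timezone is UTC+2", score: 0.7 },
];

const recalledForBob = recallForSender(store, "bob");
```

Filtering before ranking means a highly relevant memory from another user can never displace the current sender's own memories in the result.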

Provider and Channel Regressions

Several channel-specific issues have emerged:

Telegram: Reports indicate that messages are sometimes silently dropped due to update offset race conditions (#44930) and that forum topic replies can "jump" to the General topic despite topic-qualified sessions (#81874).
Feishu: A critical routing bug (#45158) causes all messages to be routed to a single agent regardless of the configured bindings, leading to session pollution and privacy leaks. Additionally, the Feishu webhook mode currently ignores configured webhookPath and accepts signed requests on arbitrary paths (#54841).
Discord: Internal tool-call traces (e.g., NO_REPLY, commentary) are intermittently leaking into user channels (#44905).
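
For the Feishu webhook path problem (#54841), one plausible guard is to compare the request path against the configured webhookPath before signature verification runs. The sketch below assumes a plain string comparison after light normalization. The helper names are hypothetical and the real handler may be structured differently.

```typescript
// Strip the query string and trailing slashes so "/hook/" and "/hook?x=1" both match "/hook".
function normalizePath(path: string): string {
  const q = path.indexOf("?");
  const withoutQuery = q === -1 ? path : path.slice(0, q);
  const trimmed = withoutQuery.replace(/\/+$/, "");
  return trimmed === "" ? "/" : trimmed;
}

// True only when the request targets the configured webhook path.
function isConfiguredWebhookPath(requestUrl: string, webhookPath: string): boolean {
  return normalizePath(requestUrl) === normalizePath(webhookPath);
}

const accepted = isConfiguredWebhookPath("/feishu/events?challenge=1", "/feishu/events/");
const rejected = isConfiguredWebhookPath("/anything/else", "/feishu/events");
```

Requests that fail this check would be rejected (for example with a 404) regardless of whether their signature is valid.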

Runtime and Provider Stability

Stability issues are prevalent in the Codex and Anthropic runtimes. The Codex app-server is experiencing silent truncation of long replies at ~1100 characters (#84516) and startup failures on Windows due to fragile command override handling (#84365).

For Anthropic providers, a regression in group chat context injection (#83419) creates consecutive user-role messages, which violates Anthropic's API requirements and triggers 500 errors via OpenRouter, causing silent fallbacks to Gemini models.
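
One way to avoid consecutive user-role messages is to coalesce adjacent same-role turns before the request is sent. The sketch below assumes a simplified message shape with string content. ChatMessage and mergeConsecutiveSameRole are illustrative names, not taken from the OpenClaw codebase.

```typescript
type Role = "user" | "assistant";

// Simplified message shape; real payloads may carry structured content blocks.
interface ChatMessage {
  role: Role;
  content: string;
}

// Merge adjacent messages that share a role so that roles strictly alternate.
function mergeConsecutiveSameRole(messages: ChatMessage[]): ChatMessage[] {
  const merged: ChatMessage[] = [];
  for (const msg of messages) {
    const last = merged[merged.length - 1];
    if (last !== undefined && last.role === msg.role) {
      last.content = `${last.content}\n\n${msg.content}`;
    } else {
      merged.push({ ...msg }); // copy so the caller's objects are not mutated
    }
  }
  return merged;
}

const turns: ChatMessage[] = [
  { role: "user", content: "[group context] #general, 4 participants" },
  { role: "user", content: "Can you summarize the thread?" },
  { role: "assistant", content: "Sure." },
  { role: "user", content: "Thanks!" },
];

const alternating = mergeConsecutiveSameRole(turns);
```

Here the injected group context and the actual user message end up in a single user turn, which is the shape the issue asks for.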

Key Themes

1. The "Stall and Block" Pattern

There is a recurring theme of internal bottlenecks causing system-wide failures. Whether it is the event loop blockage (#84903), the UV_THREADPOOL_SIZE limitation causing simultaneous API timeouts (#43374), or the Codex terminal-idle watchdog causing misleading timeouts (#85242), the system is struggling with concurrency and resource isolation.

2. State Persistence and Recovery

Issues with how state is persisted and recovered are frequent. These include the "last-write-wins" race condition in exec-approvals.json (#44749), the loss of session history due to aggressive daily-reset archiving (#45003), and the launchd-managed gateway failing to restart after an update due to inherited XPC_SERVICE_NAME environment variables (#85224).

3. Schema-Driven Model Misbehavior

Several issues stem from the tool schema being too permissive, leading models (particularly GPT-5.x) to auto-populate optional fields that then trigger strict runtime guards. This is evident in the message.send action where poll fields or Discord modal skeletons cause valid messages to be rejected (#43015, #42820).
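
A defensive option on the runtime side is to drop optional fields that a model filled with empty placeholder values before the strict guards run. The sketch below is an assumption about how that could look. The payload shape, the poll field layout, and dropEmptyOptionalFields are hypothetical and do not reflect the actual message.send schema.

```typescript
type Payload = Record<string, unknown>;

// A value counts as empty if it is null/undefined, a blank string,
// or an array/object whose members are all empty.
function isEmptyValue(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  if (typeof value === "string") return value.trim() === "";
  if (Array.isArray(value)) return value.every((v) => isEmptyValue(v));
  if (typeof value === "object") {
    return Object.values(value as Payload).every((v) => isEmptyValue(v));
  }
  return false; // numbers and booleans are always meaningful
}

// Remove optional keys whose values are empty skeletons; keep everything else.
function dropEmptyOptionalFields(payload: Payload, optionalKeys: string[]): Payload {
  const cleaned: Payload = {};
  for (const [key, value] of Object.entries(payload)) {
    if (optionalKeys.includes(key) && isEmptyValue(value)) continue;
    cleaned[key] = value;
  }
  return cleaned;
}

const fromModel: Payload = {
  action: "send",
  text: "Deploy finished.",
  poll: { question: "", options: [] }, // auto-populated skeleton
};

const sanitized = dropEmptyOptionalFields(fromModel, ["poll"]);
```

Tightening the schema so that models are not invited to fill these fields in the first place addresses the cause. A sanitizer like this only treats the symptom.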

Action Required

High-Severity / Blocked Issues

#85240 (P0 Security): Immediate implementation of sender_id filtering in the memory recall layer to prevent cross-user data leakage.
#84903 (P1): Urgent need for per-session timeout budgets and better async isolation to prevent a single stalled session from blocking the Gateway event loop.
#84886 (Beta Blocker): Implementation of a durable message dispatch idempotency ledger for Telegram to prevent duplicate agent turns during recovery.
#85228 (P1): Optimization of the xAI OAuth auth stage to reuse cached tokens, reducing per-turn latency from ~13s to near-instant.
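
As an illustration of the idempotency ledger requested in #84886, the sketch below claims each Telegram update at most once, keyed on chat ID and the Bot API's update_id. It is an in-memory stand-in: DispatchLedger, claim, snapshot, and restore are hypothetical names, and a durable version would persist the snapshot to disk or a database before dispatching.

```typescript
class DispatchLedger {
  private readonly seen = new Set<string>();

  // Returns true only the first time a (chatId, updateId) pair is claimed.
  claim(chatId: number, updateId: number): boolean {
    const key = `${chatId}:${updateId}`;
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }

  // Export claimed keys so that a durable store could persist them.
  snapshot(): string[] {
    return Array.from(this.seen);
  }

  // Rebuild a ledger from persisted keys, e.g. after a gateway restart.
  static restore(keys: string[]): DispatchLedger {
    const ledger = new DispatchLedger();
    for (const key of keys) ledger.seen.add(key);
    return ledger;
  }
}

const ledger = new DispatchLedger();
const firstDelivery = ledger.claim(1001, 42);  // first time this update is seen
const redelivery = ledger.claim(1001, 42);     // same update delivered again

// Simulate a restart: the recovered ledger still remembers the claim.
const recovered = DispatchLedger.restore(ledger.snapshot());
const afterRestart = recovered.claim(1001, 42);
```

An agent turn would be started only when claim returns true, so an update replayed during recovery does not trigger a second turn.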

Critical Contributor Attention

#44749: Fix the read-modify-write race in addAllowlistEntry using a mutex or re-read-before-write pattern.
#83419: Merge metadata and actual user messages into a single user role block for Anthropic API compatibility.
#85246: Resolve the handoff deadlock in the UI Update flow for npm global + launchd installations on macOS.
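
The mutex option mentioned for #44749 can be sketched with a promise chain that serializes read-modify-write cycles. The real addAllowlistEntry signature and file layout are not shown in the digest, so the store shape, the in-memory stand-in for exec-approvals.json, and the Mutex class below are assumptions.

```typescript
// Serializes async critical sections by chaining them on a single promise.
class Mutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const result = this.tail.then(fn);
    // Keep the chain usable even if fn rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

interface ApprovalStore {
  allowlist: string[];
}

// Hypothetical in-memory stand-in for the contents of exec-approvals.json.
let fileContents = JSON.stringify({ allowlist: [] as string[] });

async function readStore(): Promise<ApprovalStore> {
  await Promise.resolve(); // simulate async file I/O
  return JSON.parse(fileContents) as ApprovalStore;
}

async function writeStore(store: ApprovalStore): Promise<void> {
  await Promise.resolve(); // simulate async file I/O
  fileContents = JSON.stringify(store);
}

const storeLock = new Mutex();

// The whole read-modify-write cycle runs under the lock, so concurrent callers
// cannot overwrite each other's entries.
function addAllowlistEntry(entry: string): Promise<void> {
  return storeLock.runExclusive(async () => {
    const store = await readStore();
    if (!store.allowlist.includes(entry)) store.allowlist.push(entry);
    await writeStore(store);
  });
}
```

A lock like this only protects writers inside one process. If several processes can write the same file, a file lock or an atomic re-read-before-write step would still be needed.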
