← Back to Blogs
GH Issues

OpenClaw Issue Digest: Session Isolation, Provider Stability, and Runtime Regressions

00:30–06:30 UTC May 22, 2026

OpenClaw Issue Digest: Session Isolation, Provider Stability, and Runtime Regressions

Open Issues

The recent activity window for the OpenClaw repository reveals a significant number of high-severity issues centered around session isolation, provider-specific delivery failures, and regressions in the gateway's update and recovery mechanisms.

Session Isolation and Stability

Critical failures in session isolation have been reported, most notably in #84903, where a single stalled agent session can block the entire Gateway event loop, leading to 100% CPU utilization and silent message drops for all other active sessions. This represents a fundamental failure in session isolation. Similarly, #85250 highlights a bug where sessions_yield leaves parent sessions unwakeable by subagent completion events, forcing users to send manual messages to "piggyback" the result.

On the memory front, a P0 security vulnerability (#85240) was identified where the relevant-memories recall mechanism lacks sender_id isolation, potentially leaking private memories from one user into another user's conversation context in multi-user deployments.

Provider and Channel Regressions

Several channel-specific issues have emerged:

  • Telegram: Reports indicate that messages are sometimes silently dropped due to update offset race conditions (#44930) and that forum topic replies can "jump" to the General topic despite topic-qualified sessions (#81874).
  • Feishu: A critical routing bug (#45158) causes all messages to be routed to a single agent regardless of the configured bindings, leading to session pollution and privacy leaks. Additionally, the Feishu webhook mode currently ignores configured webhookPath and accepts signed requests on arbitrary paths (#54841).
  • Discord: Internal tool-call traces (e.g., NO_REPLY, commentary) are intermittently leaking into user channels (#44905).

Runtime and Provider Stability

Stability issues are prevalent in the Codex and Anthropic runtimes. The Codex app-server is experiencing silent truncation of long replies at ~1100 characters (#84516) and startup failures on Windows due to fragile command override handling (#84365).

For Anthropic providers, a regression in group chat context injection (#83419) creates consecutive user-role messages, which violates Anthropic's API requirements and triggers 500 errors via OpenRouter, causing silent fallbacks to Gemini models.

Key Themes

1. The "Stall and Block" Pattern

There is a recurring theme of internal bottlenecks causing system-wide failures. Whether it is the event loop blockage (#84903), the UV_THREADPOOL_SIZE limitation causing simultaneous API timeouts (#43374), or the Codex terminal-idle watchdog causing misleading timeouts (#85242), the system is struggling with concurrency and resource isolation.

2. State Persistence and Recovery

Issues with how state is persisted and recovered are frequent. This includes the "last-write-wins" race condition in exec-approvals.json (#44749), the loss of session history due to aggressive daily-reset archiving (#45003), and the lauchd-managed gateway failing to restart after an update due to inherited XPC_SERVICE_NAME environment variables (#85224).

3. Schema-Driven Model Misbehavior

Several issues stem from the tool schema being too permissive, leading models (particularly GPT-5.x) to auto-populate optional fields that then trigger strict runtime guards. This is evident in the message.send action where poll fields or Discord modal skeletons cause valid messages to be rejected (#43015, #42820).

Action Required

High-Severity / Blocked Issues

  • #85240 (P0 Security): Immediate implementation of sender_id filtering in the memory recall layer to prevent cross-user data leakage.
  • #84903 (P1): Urgent need for per-session timeout budgets and better async isolation to prevent a single stalled session from crashing the Gateway event loop.
  • #84886 (Beta Blocker): Implementation of a durable message dispatch idempotency ledger for Telegram to prevent duplicate agent turns during recovery.
  • #85228 (P1): Optimization of the xAI OAuth auth stage to reuse cached tokens, reducing per-turn latency from ~13s to near-instant.

Critical Contributor Attention

  • #44749: Fix the read-modify-write race in addAllowlistEntry using a mutex or re-read-before-write pattern.
  • #83419: Merge metadata and actual user messages into a single user role block for Anthropic API compatibility.
  • #85246: Resolve the handoff deadlock in the UI Update flow for npm global + launchd installations on macOS.

References

Issues