OpenClaw Gateway Performance and Stability Report: May 7, 2026
The recent activity in the OpenClaw repository reveals a period of significant architectural transition and stability challenges. As the platform moves toward externalizing core channel extensions into separate npm packages, several critical regressions have emerged, primarily centered around packaging errors, event loop saturation, and broken authentication flows for external plugins.
Of particular concern are the systemic performance degradations where synchronous I/O and inefficient tool initialization are blocking the Node.js main thread, effectively neutralizing the benefits of multi-core hardware and rendering multi-agent concurrency unusable in high-load environments.
Open Issues
Event Loop Saturation and Performance
Several reports highlight a critical bottleneck in the Gateway's single-threaded architecture. Users running multiple channel integrations (e.g., 16+ Telegram bots) report event loop utilization hitting 97-100%, with P99 delays exceeding 30 seconds. This is exacerbated by:
- Synchronous I/O: Use of execSync for macOS keychain access and fs.readFileSync for credential checks is causing the main thread to freeze for up to 4 seconds (#78805).
- Tool Initialization Overhead: The built-in pdf-tool synchronously initializes on every agent run, adding 6-11 seconds of blocking time per turn, regardless of whether the tool is actually used (#77204, #77316).
- Plugin Factory Costs: Every embedded run re-invokes plugin factories, leading to significant per-turn overhead. Even on warm turns, plugin loading can take several seconds, suggesting a need for a one-time construction of the tool registry (#77180).
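The tool-initialization and plugin-factory costs above both point toward constructing tools once and only when needed. The following is a minimal sketch of that idea, not the Gateway's actual code: LazyToolRegistry, Tool, and ToolFactory are hypothetical names, and the counter merely stands in for the expensive initialization.

```typescript
// Hypothetical types: none of these names come from the OpenClaw codebase.
type Tool = { name: string; run: (input: string) => string };
type ToolFactory = () => Tool;

class LazyToolRegistry {
  private readonly factories = new Map<string, ToolFactory>();
  private readonly instances = new Map<string, Tool>();

  // Registering a tool only stores its factory; nothing is initialized yet.
  register(name: string, factory: ToolFactory): void {
    this.factories.set(name, factory);
  }

  // Construct the tool on first use; later calls reuse the cached instance.
  get(name: string): Tool {
    const cached = this.instances.get(name);
    if (cached) return cached;
    const factory = this.factories.get(name);
    if (!factory) throw new Error(`unknown tool: ${name}`);
    const tool = factory();
    this.instances.set(name, tool);
    return tool;
  }
}

// Build the registry once per process rather than once per agent run.
let pdfInitCount = 0;
const registry = new LazyToolRegistry();
registry.register("pdf-tool", () => {
  pdfInitCount += 1; // stands in for the expensive initialization
  return { name: "pdf-tool", run: (input) => `parsed:${input}` };
});
```

With this shape, a turn that never touches the PDF tool pays nothing for it, and a turn that does pays the initialization cost only the first time.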
Packaging and Migration Regressions
The transition from built-in extensions to external @openclaw/* packages has introduced several "silent" failures:
- Missing Runtime Dependencies: Versions 2026.5.3-1 are missing critical install.runtime-*.js files, blocking plugin updates and installations (#77289, #77293).
- Silent Plugin Drops: Upgrading from 4.x to 5.x silently drops built-in channels (WhatsApp, Discord, etc.) without warning the user to install the new external versions (#77483).
- Manifest Mismatches: The Discord plugin (@openclaw/discord@2026.5.3) suffers from broken manifests, including onStartup: false and empty channel contracts, preventing the channel from starting even when the plugin is loaded (#77354).
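Manifest problems of the kind described in the last item could in principle be caught by a sanity check at load or publish time. The sketch below assumes a simplified manifest shape; the PluginManifest interface and its contracts.channels field are illustrative guesses, not the real @openclaw/* schema.

```typescript
// Hypothetical manifest shape; the real @openclaw/* manifest schema may differ.
interface PluginManifest {
  name: string;
  onStartup: boolean;
  contracts: { channels: string[] };
}

// Return a human-readable list of problems instead of failing silently.
function findManifestProblems(manifest: PluginManifest): string[] {
  const problems: string[] = [];
  if (!manifest.onStartup) {
    problems.push(`${manifest.name}: onStartup is false`);
  }
  if (manifest.contracts.channels.length === 0) {
    problems.push(`${manifest.name}: channel contract list is empty`);
  }
  return problems;
}

const broken: PluginManifest = {
  name: "@openclaw/discord",
  onStartup: false,
  contracts: { channels: [] },
};
const healthy: PluginManifest = {
  name: "@openclaw/discord",
  onStartup: true,
  contracts: { channels: ["discord"] },
};

console.log(findManifestProblems(broken));
console.log(findManifestProblems(healthy));
```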
Authentication and Secret Resolution
SecretRef resolution is currently unreliable for external plugins. A recurring issue is that resolvePluginContractApiPath fails to search the dist/ subdirectory, meaning Discord and other npm-installed plugins cannot resolve their tokens from the runtime snapshot, leading to error:not configured states (#77241, #77416).
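The lookup gap can be illustrated with a resolver that tries both the package root and its dist/ subdirectory. This is a sketch under stated assumptions: the function name, its signature, and the file name contract-api.js are hypothetical and do not reflect the real resolvePluginContractApiPath. The existence check is injected so the logic can be exercised without touching the filesystem.

```typescript
import * as path from "node:path";

// Hypothetical resolver: checks the plugin root first, then its dist/ subdirectory.
// `exists` is injected (e.g. a wrapper around a filesystem check) to keep the sketch testable.
function resolveContractApiPathSketch(
  pluginRoot: string,
  fileName: string,
  exists: (candidate: string) => boolean,
): string | undefined {
  const candidates = [
    path.join(pluginRoot, fileName),
    path.join(pluginRoot, "dist", fileName),
  ];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return undefined;
}

// Example: an npm-installed plugin that ships its build output under dist/ only.
const pluginRoot = path.join("node_modules", "@openclaw", "discord");
const shipped = path.join(pluginRoot, "dist", "contract-api.js"); // hypothetical file name
const resolved = resolveContractApiPathSketch(
  pluginRoot,
  "contract-api.js",
  (candidate) => candidate === shipped,
);
console.log(resolved);
```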
Key Themes
The "Single-Threaded Ceiling"
There is a growing consensus that the current Node.js main-thread architecture is a fundamental scalability limit. As highlighted in #78808, hardware upgrades are wasted because the event loop is monopolized by I/O-heavy channel polling. The proposed solution is to offload channel polling to dedicated Worker Threads to prevent I/O completion callbacks from starving agent tasks.
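As a rough illustration of the proposal, the sketch below moves a polling loop into a Node.js worker thread and forwards results to the main thread as messages. It is not the design under discussion in #78808: startChannelPoller and ChannelUpdate are hypothetical names, and a timer stands in for real network polling. The worker body is given as a plain JavaScript string only to keep the example in one file.

```typescript
import { Worker } from "node:worker_threads";

type ChannelUpdate = { channel: string; update: string };

// Worker body as plain JavaScript; a timer simulates the channel's polling loop.
const pollerSource = `
  const { parentPort, workerData } = require("worker_threads");
  let tick = 0;
  setInterval(() => {
    tick += 1;
    parentPort.postMessage({ channel: workerData.channel, update: "update-" + tick });
  }, workerData.intervalMs);
`;

// Start one worker per channel; the main thread only handles finished updates.
function startChannelPoller(
  channel: string,
  intervalMs: number,
  onUpdate: (update: ChannelUpdate) => void,
): Worker {
  const worker = new Worker(pollerSource, {
    eval: true,
    workerData: { channel, intervalMs },
  });
  worker.on("message", onUpdate);
  worker.on("error", (err) => console.error(`poller for ${channel} failed:`, err));
  return worker;
}
```

A caller would keep the returned Worker and call its terminate() method when the channel is removed or the process shuts down.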
Persistence and History Corruption
Regressions in session transcript handling are causing data loss in the WebChat UI. Specifically, since the removal of SessionManager in v5.2, true appends have been replaced by a read-migrate-rewrite cycle using fs.writeFile, which can overwrite transcripts on every turn and orphan previous message branches (#77012). Additionally, assistant text responses in tool-using turns are frequently not persisted to the JSONL transcript, causing messages to vanish upon history reload (#76804).
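For contrast with a rewrite cycle, a true append to a JSONL transcript can be sketched with appendFile from node:fs/promises, which adds to the end of the file without reading or replacing earlier lines. The TranscriptEntry shape and function names here are assumptions for illustration, not the project's transcript format.

```typescript
import { appendFile, mkdtemp, readFile } from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical entry shape; the real transcript schema is not shown in this report.
type TranscriptEntry = { role: "user" | "assistant" | "tool"; text: string };

// One JSON object per line, terminated by a newline.
function toJsonlLine(entry: TranscriptEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Append a single entry; existing lines in the file are left untouched.
async function appendTranscriptEntry(file: string, entry: TranscriptEntry): Promise<void> {
  await appendFile(file, toJsonlLine(entry), "utf8");
}

// Usage: write two turns to a temporary transcript and read the lines back.
async function demo(): Promise<string[]> {
  const dir = await mkdtemp(path.join(os.tmpdir(), "transcript-"));
  const file = path.join(dir, "session.jsonl");
  await appendTranscriptEntry(file, { role: "user", text: "hi" });
  await appendTranscriptEntry(file, { role: "assistant", text: "hello" });
  return (await readFile(file, "utf8")).trimEnd().split("\n");
}
```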
Provider-Specific Protocol Gaps
Several high-severity bugs affect specific LLM providers:
- Gemini 3.x: Silent hangs occur during sub-agent flows because thoughtSignature is dropped during cross-provider replay (#77566).
- DeepSeek v4-pro: Conversations fail with 400 errors when reasoning_content is not passed back consistently while thinking is disabled (#74374).
- Gemma 4: Context corruption occurs in LM Studio endpoints when reasoning blocks are not stripped from replay history (#77275).
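These bugs pull in opposite directions: some providers need reasoning data preserved on replay, while others need it removed. One way to express that is a per-provider replay policy. The sketch below is illustrative only; the Message and Block shapes and the ReplayPolicy flag are hypothetical and do not reflect the Gateway's internal types.

```typescript
// Hypothetical history shapes for illustration.
type Block = { type: "text" | "reasoning"; text: string };
type Message = { role: "user" | "assistant"; blocks: Block[] };

// Per-provider policy: whether reasoning blocks must be removed before replay.
type ReplayPolicy = { stripReasoning: boolean };

// Returns a new history; the input messages are never mutated.
function prepareReplayHistory(history: Message[], policy: ReplayPolicy): Message[] {
  if (!policy.stripReasoning) return history;
  return history.map((message) => ({
    ...message,
    blocks: message.blocks.filter((block) => block.type !== "reasoning"),
  }));
}

const history: Message[] = [
  { role: "user", blocks: [{ type: "text", text: "Summarize the file." }] },
  {
    role: "assistant",
    blocks: [
      { type: "reasoning", text: "The user wants a summary." },
      { type: "text", text: "Here is the summary." },
    ],
  },
];

const forStrippingProvider = prepareReplayHistory(history, { stripReasoning: true });
const forPreservingProvider = prepareReplayHistory(history, { stripReasoning: false });
```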
Action Required
High Severity / Blocked
- Fix resolvePluginContractApiPath: This is a critical blocker for all externalized channel plugins using SecretRefs. It must be updated to search the dist/ directory (#77241).
- Address pdf-tool Blocking: The 6-11 second synchronous initialization of the PDF tool must be converted to lazy-loading or cached to prevent systemic gateway stalls (#77204).
- Repair WebChat Persistence: The destructive writeFile cycle in mirrorCodexAppServerTranscript must be replaced with a true append mechanism to stop the loss of session history (#77012).
Contributor Attention Needed
- Audit process.on('beforeExit'): A memory leak is causing RSS growth and MaxListenersExceededWarning in long-running processes (#77297).
- Implement announceTarget: "parent": To enable true multi-agent orchestration, sub-agent completion announcements need to be routable to the parent session rather than just the channel (#27445).
- Fix doctor --fix Auth Deletion: The doctor tool should be prevented from silently deleting auth.profiles entries that are still referenced by active fallback chains (#77400).
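The guard requested in the last item amounts to a reference check before deletion. A minimal sketch follows, assuming profiles are identified by string IDs and a fallback chain is an ordered list of those IDs; planProfileDeletion and both shapes are hypothetical, not the doctor tool's actual data model.

```typescript
// Hypothetical: a fallback chain is an ordered list of auth profile IDs.
type FallbackChain = string[];

// Split deletion candidates into those safe to remove and those still referenced.
function planProfileDeletion(
  candidates: string[],
  activeChains: FallbackChain[],
): { deletable: string[]; retained: string[] } {
  const referenced = new Set<string>();
  for (const chain of activeChains) {
    for (const profileId of chain) referenced.add(profileId);
  }
  const deletable: string[] = [];
  const retained: string[] = [];
  for (const profileId of candidates) {
    (referenced.has(profileId) ? retained : deletable).push(profileId);
  }
  return { deletable, retained };
}

const plan = planProfileDeletion(
  ["stale-profile", "backup-profile"],
  [["primary-profile", "backup-profile"]],
);
console.log(plan);
```

Reporting the retained entries to the user, rather than dropping them quietly, would also address the "silent" part of the problem.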