Debugging the Occasional ECONNRESET: When Unread Data Triggers TCP Resets
Intermittent network errors are among the most frustrating bugs to debug. One of the most common yet misunderstood errors is ECONNRESET ("Connection reset by peer"). While it often suggests a crash or a network outage, the root cause is frequently more subtle: a mismatch in how two services handle the closing of a TCP connection.
This article examines a specific scenario where a server sends a response and closes the connection, but the client receives an ECONNRESET instead of a clean termination. By analyzing a controlled reproducer and a real-world production failure involving Nginx and Gunicorn, we can uncover the mechanics of the TCP Reset (RST) packet.
The Mystery of the Intermittent Reset
Consider two services running on the same machine. A server opens a listening TCP socket on localhost, and a client connects to it. They exchange data normally. However, every now and then, the client receives an ECONNRESET while reading data from the socket, even though the server logs show no crashes and no errors.
To investigate this, a "lab" reproducer was built: a server that dumps 600,000 bytes to a client upon connection and then immediately calls close().
Under normal conditions, the client reads the data and the connection closes cleanly. But when the client is modified to "spam" the server (sending some data to the server before attempting to read), the behavior changes. The client's recv() call begins returning -1 with errno set to ECONNRESET (104 on Linux; the numeric value differs on other platforms).
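The experiment can be approximated in a single Python script. This is a minimal sketch, not the original reproducer: the names serve_once and run_demo are hypothetical, and the "spam" is deliberately sent before the server accepts the connection so that it is already queued when close() runs. That departs from the racy timing described above in order to make the result repeatable.

```python
import errno
import socket
import threading
from typing import Optional, Tuple

TOTAL = 600_000  # payload size used by the reproducer described above


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, dump TOTAL bytes, then close without reading."""
    conn, _ = listener.accept()
    try:
        conn.sendall(b"x" * TOTAL)
    finally:
        conn.close()  # anything the client sent is still unread at this point


def run_demo(spam: bool) -> Tuple[int, Optional[int]]:
    """Return (bytes_received, errno or None) as observed by the client."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    client = socket.create_connection(listener.getsockname())
    client.settimeout(10)
    if spam:
        # Assumption: sending before accept() guarantees the data is queued
        # (and never read) on the server side of the connection.
        client.sendall(b"spam" * 16)
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    received, err = 0, None
    try:
        while True:
            chunk = client.recv(65536)
            if not chunk:  # clean EOF: the server closed with a FIN
                break
            received += len(chunk)
    except ConnectionResetError as exc:  # the server's close produced an RST
        err = exc.errno
    finally:
        client.close()
        server.join(timeout=10)
        listener.close()
    return received, err


if __name__ == "__main__":
    for spam in (False, True):
        received, err = run_demo(spam)
        name = errno.errorcode.get(err, "none") if err is not None else "none"
        print(f"spam={spam}: received {received} bytes, error: {name}")
```

On Linux the spam=True run would be expected to end in ECONNRESET, usually after only part of the payload has arrived, while the spam=False run reads all 600,000 bytes. Other TCP stacks may behave differently.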
Analyzing the Wire and the System Calls
Using tcpdump, it becomes clear that a TCP RST packet is actually being sent from the server to the client. This is surprising because the server's strace shows a perfectly normal execution flow:
- accept() a connection.
- sendto() the 600,000 bytes.
- close() the socket.
- exit_group(0) (clean exit).
From the server's perspective, all data was sent successfully. However, the client's strace reveals that it only manages to read a portion of the data before the ECONNRESET hits.
The Hypothesis: The "Dirty" Socket
By introducing a sleep(1) before the close() call in the server, the ECONNRESET disappears. This suggests a timing issue related to the closing of the socket.
The hypothesis is that when the server calls close(), there is still data pending in the server's receive buffer (the "spam" sent by the client). In TCP, if a process closes a socket that still has unread data in its receive queue, the stack sends an RST packet to the peer instead of the standard FIN packet used for a graceful shutdown.
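One way to check this hypothesis is to ask the kernel how many bytes are still sitting unread on a socket just before it is closed. The sketch below assumes a Unix-like system where the FIONREAD ioctl is supported on TCP sockets (as it is on Linux). The helper name unread_bytes is hypothetical and is not part of the original investigation.

```python
import fcntl
import socket
import struct
import termios


def unread_bytes(sock: socket.socket) -> int:
    """Return the number of bytes queued in the socket's receive buffer."""
    # FIONREAD writes an int into the buffer we pass; with an immutable
    # bytes argument, fcntl.ioctl returns the updated buffer contents.
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def close_with_report(conn: socket.socket) -> int:
    """Close conn and return how many bytes were left unread at that moment."""
    pending = unread_bytes(conn)
    conn.close()  # a non-zero `pending` is the case that triggers the RST
    return pending
```

Logging the value returned by close_with_report in the server would show whether the resets line up with closes that left data behind.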
Real-World Application: Nginx and Gunicorn
This theoretical behavior manifests in production environments, specifically in a stack involving Nginx as a reverse proxy and Gunicorn serving a Flask application.
In this scenario, Nginx sends an HTTP request to Gunicorn using two writev() calls: one for the headers and one for the body. Gunicorn reads the headers, but if the application is "lazy" and doesn't explicitly access the request body, Gunicorn may never call recv() for that remaining data.
When Gunicorn finishes processing the request, it sends the response and calls close(). Because the request body is still sitting unread in the socket buffer, the kernel triggers a TCP RST. Nginx, receiving this RST, logs an ECONNRESET.
The Solution
To resolve this, the application must ensure that the entire request body is read from the socket before the connection is closed. In the Flask/Gunicorn example, performing a dummy operation on the HTTP body forces the read, clearing the buffer and allowing for a graceful FIN closure.
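In a Flask view, a call such as request.get_data() is one way to force that read. The same idea can also be applied once for the whole application at the WSGI layer. The following is a minimal sketch using only the standard library; DrainRequestBody, MAX_DRAIN and lazy_app are hypothetical names. It assumes the server's wsgi.input stream returns an empty result at the end of the request body, and that the response is fully computed by the time the wrapped application returns.

```python
import io

MAX_DRAIN = 1 << 20  # hypothetical cap (1 MiB) on how much body we discard


class DrainRequestBody:
    """WSGI middleware that reads and discards any body the app left unread."""

    def __init__(self, app, limit=MAX_DRAIN):
        self.app = app
        self.limit = limit

    def __call__(self, environ, start_response):
        response = self.app(environ, start_response)
        self._drain(environ.get("wsgi.input"))
        return response

    def _drain(self, stream):
        remaining = self.limit
        while stream is not None and remaining > 0:
            chunk = stream.read(min(8192, remaining))
            if not chunk:  # EOF: the whole body has been consumed
                break
            remaining -= len(chunk)


def lazy_app(environ, start_response):
    """Stand-in for a view that never touches the request body."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


application = DrainRequestBody(lazy_app)

if __name__ == "__main__":
    body = io.BytesIO(b"a" * 5000)  # simulated unread request body
    environ = {"wsgi.input": body, "CONTENT_LENGTH": "5000"}
    application(environ, lambda status, headers: None)
    print("bytes left unread:", len(body.read()))
```

With Flask, the usual way to install such a wrapper is app.wsgi_app = DrainRequestBody(app.wsgi_app). Note the trade-off in the cap: a body larger than the limit is still left partly unread, so the reset can still occur for oversized requests.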
Technical Deep Dive: Why the RST?
This behavior is not a bug but documented TCP behavior. RFC 1122 (section 4.2.2.13) says a host SHOULD send an RST if a CLOSE call is issued while received data is still pending, and RFC 2525 lists failing to do so as a known implementation problem. In the words of RFC 1122:
A host MAY implement a "half-duplex" TCP close sequence... If such a host issues a CLOSE call while received data is still pending in TCP, or if new data is received after CLOSE is called, its TCP SHOULD send a RST to show that data was lost.
Why not just send a FIN?
A FIN is the first step of TCP's normal four-way connection teardown, and it tells the peer only that the sender has finished sending. Sending one while received data is still unread would be misleading: the peer could take the clean close to mean that everything it sent was delivered and processed, when in fact some of it was discarded. The RST is a signal to the peer that the data it sent was not fully consumed by the application.
Key Takeaways for Developers
- Read everything: If you are implementing a protocol where both sides send data, ensure the receiver reads all available data before closing the socket.
- Graceful Shutdowns: Consider using shutdown(SHUT_WR) to signal that you are done sending data while still allowing the socket to be read until EOF.
- Beware of "Lazy" Frameworks: High-level frameworks may abstract away socket reads. If you notice intermittent ECONNRESET errors in a reverse proxy, check if your application is ignoring the request body.
- Buffer Limits: When forcing a read of the request body to prevent RSTs, be mindful of client_max_body_size (in Nginx) to prevent Denial of Service (DoS) attacks via massive request bodies.
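The shutdown(SHUT_WR) approach from the list above can be sketched as follows. This is an illustration under assumptions, not a drop-in fix: graceful_close and fetch_with_spam are hypothetical names, the 5-second timeout is arbitrary, and the demo client sends its unsolicited data before the server accepts so that the data is certain to be pending.

```python
import socket
import threading

TOTAL = 600_000  # same payload size as the reproducer


def graceful_close(conn: socket.socket, timeout: float = 5.0) -> None:
    """Finish sending, then read until the peer closes, then close."""
    conn.shutdown(socket.SHUT_WR)  # FIN is sent once queued data is flushed
    conn.settimeout(timeout)       # do not wait forever for a stuck peer
    try:
        while conn.recv(65536):    # discard whatever the peer still sends
            pass
    except OSError:                # timeout or reset: give up on draining
        pass
    finally:
        conn.close()               # receive queue is empty after a clean EOF


def serve_once(listener: socket.socket) -> None:
    conn, _ = listener.accept()
    conn.sendall(b"x" * TOTAL)
    graceful_close(conn)


def fetch_with_spam() -> int:
    """Client that sends unsolicited data first; returns bytes received."""
    listener = socket.create_server(("127.0.0.1", 0))
    client = socket.create_connection(listener.getsockname())
    client.settimeout(10)
    client.sendall(b"spam" * 16)   # queued on the server before accept()
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    received = 0
    while True:
        chunk = client.recv(65536)
        if not chunk:              # EOF produced by the server's SHUT_WR
            break
        received += len(chunk)
    client.close()                 # lets the server's drain loop see EOF
    server.join(timeout=10)
    listener.close()
    return received


if __name__ == "__main__":
    print("received", fetch_with_spam(), "bytes")
```

Because the server reads until the client closes, its receive queue is empty when close() is finally called, so the connection ends with a FIN rather than an RST. The timeout bounds how long a misbehaving peer can hold the connection open.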