The Hidden Danger of systemd-resolved in Automated Certificate Management

May 13, 2026

The stability of modern web infrastructure often relies on the silent operation of automated tools. Caddy, for instance, is widely praised for its ability to obtain and renew SSL/TLS certificates automatically via Let's Encrypt or ZeroSSL. However, as a recent incident highlighted by Paul Houle shows, a certificate can expire even when the system appears healthy, because of a subtle failure in the DNS resolution provided by systemd-resolved.

The Failure Chain: From DNS to Expired Certificates

When a web server like Caddy manages certificates, it must communicate with a Certificate Authority (CA). This process requires reliable DNS resolution to locate the CA's API endpoints. If DNS resolution fails, Caddy cannot request a renewal, and the certificate eventually expires.

In this specific case, the issue was not a total network outage but a "selective" breakage. This kind of failure is especially dangerous because the system continues to function for most tasks and fails only on specific requests. Under certain conditions, such as configuration mismatches or bugs, systemd-resolved may fail to resolve some external domains while continuing to resolve others, or it may fail to resolve the local hostname correctly, which confuses the application.
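A selective failure of this kind can be probed by resolving several names and comparing the outcomes. The following is a minimal sketch, not the tooling used in the incident: the host list (the Let's Encrypt and ZeroSSL ACME directory hosts plus an unrelated control domain) and the function names are assumptions chosen for illustration.

```python
import socket
from typing import Callable, Dict, Iterable, Optional

# Assumed for illustration: two ACME directory hosts and one control domain.
DEFAULT_HOSTS = (
    "acme-v02.api.letsencrypt.org",
    "acme.zerossl.com",
    "example.com",
)


def check_resolution(
    hosts: Iterable[str],
    resolver: Callable[..., object] = socket.getaddrinfo,
) -> Dict[str, Optional[str]]:
    """Map each host to None on success, or to the error text on failure."""
    results: Dict[str, Optional[str]] = {}
    for host in hosts:
        try:
            resolver(host, 443)
            results[host] = None
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = str(exc)
    return results


def is_selective_failure(results: Dict[str, Optional[str]]) -> bool:
    """True when some hosts resolve and others do not."""
    failed = sum(1 for err in results.values() if err is not None)
    return 0 < failed < len(results)
```

Calling `is_selective_failure(check_resolution(DEFAULT_HOSTS))` from a cron job or health check would flag the case where the control domain resolves but a CA endpoint does not. Note that this exercises the system resolver as Python sees it, which may not be the exact lookup path the web server uses.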

The systemd-resolved Controversy

The incident has sparked a broader discussion among systems engineers regarding the reliability of systemd-resolved and its related networking components. While systemd is an industry-standard service manager, there is a significant divide on whether its scope should have extended into networking.

One critical point raised by community members is the potential for systemd-resolved to interfere with the server's own hostname resolution. The resolver synthesizes answers for the machine's own hostname from locally configured addresses instead of querying DNS, and a known bug can cause issues when the server's hostname is identical to the domain it is serving. This can be mitigated by creating a service override to disable hostname synthesis:

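# /etc/systemd/system/systemd-resolved.service.d/override.conf
# Apply with: systemctl daemon-reload && systemctl restart systemd-resolved
[Service]
# "0" tells systemd-resolved not to synthesize records for the local hostname
Environment=SYSTEMD_RESOLVED_SYNTHESIZE_HOSTNAME=0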

Alternative Networking Strategies

For those who prioritize absolute reliability in server DNS, some experts suggest moving away from the systemd networking stack entirely. The argument is that systemd-resolved is designed more for client-side flexibility (like handling frequently changing network connectivity) than for the rigid stability required by a server.

Recommended Alternatives

  • Unbound: A validating, recursive, caching DNS resolver. It is often recommended for servers because it provides more control and stability, though it may require manual restarts or cache flushes when network attachments change.
  • Direct Resolver Configuration: Bypassing systemd-resolved by managing /etc/resolv.conf directly or using a dedicated DNS daemon that does not attempt to "synthesize" or manage the network state dynamically.
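Before switching resolvers, it helps to confirm which one a host actually uses. The sketch below checks whether `/etc/resolv.conf` points at the systemd-resolved stub listener on 127.0.0.53. The function names are hypothetical, and the check only looks at `nameserver` lines; it does not cover lookups routed through NSS modules.

```python
from pathlib import Path
from typing import List

STUB_ADDRESS = "127.0.0.53"  # systemd-resolved's local stub listener


def parse_nameservers(resolv_conf_text: str) -> List[str]:
    """Return the addresses on 'nameserver' lines, ignoring comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":  # resolv.conf comment markers
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def uses_resolved_stub(resolv_conf_text: str) -> bool:
    """True if any configured nameserver is the systemd-resolved stub."""
    return STUB_ADDRESS in parse_nameservers(resolv_conf_text)


def read_resolv_conf(path: str = "/etc/resolv.conf") -> str:
    """Read the resolver configuration from disk."""
    return Path(path).read_text()
```

On a server, `uses_resolved_stub(read_resolv_conf())` returning `True` suggests that applications reading `/etc/resolv.conf` still send their queries through systemd-resolved.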

Conclusion

Automated certificate management is a powerful tool, but it creates a dependency on the underlying OS networking stack. When that stack fails selectively, the result is a silent failure that leads to a visible outage. To ensure high availability, administrators should monitor not only certificate expiration dates but also the DNS resolution that the renewing service depends on.
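The expiration side of that monitoring can be sketched with Python's standard library alone. This is an illustrative example rather than a recommended production check: the function names are hypothetical, and any alert threshold is left to the operator.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400.0


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Connect to host:port and return the served certificate's notAfter."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A check such as `days_until_expiry(fetch_not_after("example.com"))` could alert when the value drops below a chosen threshold. Because the default context verifies the certificate, an already expired certificate causes `wrap_socket` to raise a verification error, which is itself a signal worth alerting on.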
