Automating SSL certificate renewal with Let's Encrypt through an ACME client like Certbot takes much of the burden out of system administration. However, even with automated setups, renewal failures and expired certificates do happen. I'll cover the most common pitfalls below and, crucially, how to prevent them.
Reason 1. Misconfigured Cron Jobs or Systemd Timers
Certbot relies on cron jobs or systemd timers (depending on your system) to handle automatic renewals. These are known to fail silently, either because they were never configured in the first place, or because they were accidentally modified by something like a system update.
Solution: Periodically confirm scheduled tasks:
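# the timer name varies by install method, so match any unit containing "certbot"
systemctl list-timers --all '*certbot*'
# or, for cron-based setups (root's crontab and the system-wide cron files)
sudo crontab -l | grep certbot
grep -rs certbot /etc/crontab /etc/cron.d/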
Note that some distributions, like Amazon Linux, ship without a cron daemon, which will silently cause cron-based automated renewals to fail. I found this out the hard way.
You can install a cron daemon like cronie using the appropriate command for your distribution, then make sure its service is enabled and running:
sudo yum install cronie
sudo systemctl enable --now crond
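The scheduling checks above can be combined into a small script suitable for a monitoring job. This is only a sketch: the function names are made up for illustration, the cron locations searched are common defaults rather than an exhaustive list, and a plain text match will also hit commented-out entries.

```shell
#!/bin/sh
# Sketch: report whether any Certbot renewal schedule is visible on this host.
# Function names are illustrative; search locations are common defaults only.

# has_certbot_entry FILE...: succeeds if any readable file mentions certbot.
# Missing or unreadable files are skipped quietly (-s).
has_certbot_entry() {
    grep -qs certbot "$@"
}

# check_renewal_schedule: look for a systemd timer first, then cron entries.
# Run as root so that root's crontab is the one being inspected.
check_renewal_schedule() {
    if command -v systemctl >/dev/null 2>&1 &&
       systemctl list-timers --all 2>/dev/null | grep -q certbot; then
        echo "systemd timer found"
        return 0
    fi
    if has_certbot_entry /etc/crontab /etc/cron.d/* ||
       crontab -l 2>/dev/null | grep -q certbot; then
        echo "cron entry found"
        return 0
    fi
    echo "no certbot renewal schedule found" >&2
    return 1
}
```

Calling check_renewal_schedule returns a nonzero exit status when nothing is found, so it can be wired into whatever alerting you already use.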
Reason 2. Permission Issues
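Certificate renewal requires Certbot to read and write its configuration, work, and log directories (by default /etc/letsencrypt, /var/lib/letsencrypt, and /var/log/letsencrypt). It's easy to inadvertently lock down a directory during day-to-day system administration.
Solution: Ensure that the user Certbot runs as (normally root) can still read and write those directories. Don't reach for a recursive chmod -R 755 on /etc/letsencrypt/, because that would make your private keys readable by every user on the system. Inspect ownership and modes first, and correct only what has changed:
sudo ls -ld /etc/letsencrypt/ /etc/letsencrypt/live /etc/letsencrypt/archive
sudo chmod 755 /etc/letsencrypt/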
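Because private keys also live under /etc/letsencrypt/, it is worth confirming after any permission change that none of them has become accessible to other users. The sketch below assumes Certbot's usual layout, where key files are named privkeyN.pem under the archive directory; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: list private keys that grant any permission to "other" users.
# Assumes Certbot's usual naming (privkeyN.pem); the function name is made up.

# find_exposed_keys DIR: print every regular file named privkey*.pem under
# DIR whose mode has the read, write, or execute bit set for "other".
find_exposed_keys() {
    find "$1" -type f -name 'privkey*.pem' \
        \( -perm -004 -o -perm -002 -o -perm -001 \) 2>/dev/null
}
```

For example, sudo sh -c '. ./check-keys.sh; find_exposed_keys /etc/letsencrypt' (with the script saved under that illustrative file name) should print nothing on a healthy system. Any path it does print is a key whose mode deserves a closer look.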
Reason 3. Not Serving the acme-challenge File Over HTTP
When Certbot asks Let's Encrypt to renew your SSL certificate, Let's Encrypt performs a validation check by making a regular HTTP request for a token file in the .well-known/acme-challenge directory at the root of your domain to verify your control over that domain. If your server isn't configured to serve plaintext files over regular HTTP at that location, renewal will fail (assuming you're using the more common http challenge method, HTTP-01, rather than the dns challenge, DNS-01).
Solution: Make sure the directory from which you serve files, e.g. /var/www for Apache, is able to have a .well-known directory created while the server is running, and that your server will successfully serve a file in that directory over regular HTTP. If you generally redirect all HTTP traffic to HTTPS, as you should, the most robust setup is to exempt that particular directory from the redirect: Let's Encrypt does follow redirects to HTTPS, but answering the challenge directly over HTTP removes one more thing that can break.
You can test your configuration using Certbot's --dry-run option:
sudo certbot renew --dry-run
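A dry run exercises the whole renewal path. To isolate just the web server side, you can place a throwaway file in the challenge directory and fetch it yourself. The sketch below assumes a webroot-style setup (Certbot writing challenge files into a directory your server already serves) and that curl is installed; the function names are hypothetical. It follows redirects and skips certificate verification, which roughly mirrors how Let's Encrypt documents its own HTTP-01 requests.

```shell
#!/bin/sh
# Sketch: verify that files under /.well-known/acme-challenge/ are served
# when requested over plain HTTP. Assumes a webroot-style setup and curl.

# probe_url DOMAIN NAME: print the plain-HTTP URL for a challenge file.
probe_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# check_challenge_path WEBROOT DOMAIN: write a throwaway file, fetch it,
# remove it again, and succeed only if the served body matches.
check_challenge_path() {
    webroot=$1
    domain=$2
    name="probe-$$"
    dir="$webroot/.well-known/acme-challenge"
    mkdir -p "$dir" || return 1
    printf 'ok-%s' "$$" > "$dir/$name" || return 1
    # -f: fail on HTTP errors, -L: follow redirects, -k: skip TLS verification
    body=$(curl -fsSL -k --max-redirs 10 --max-time 10 \
        "$(probe_url "$domain" "$name")")
    status=$?
    rm -f "$dir/$name"
    [ "$status" -eq 0 ] && [ "$body" = "ok-$$" ]
}
```

A call such as check_challenge_path /var/www example.com (both values are placeholders) is most meaningful when run from a machine outside your network, with the probe file created on the server, since that is the direction Let's Encrypt's requests travel.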
Reason 4. Network Configuration Issues
Network misconfigurations, firewall rules, or server security groups blocking Let's Encrypt's validation requests will cause renewal attempts to fail. This is more likely to be an issue on a corporate or otherwise private network, but you won't be able to renew your certificates with the normal http challenge procedure if Let's Encrypt's servers can't reach yours.
Solution: Let's Encrypt does not publish a list of IP addresses for its validation servers, and those addresses can change at any time, so you have to configure your network to allow inbound HTTP traffic on port 80 from any address. If stricter rules are needed, consider using the DNS-based validation method (though this can be more difficult to automate). Again, a --dry-run can be used to verify whether an actual renewal can succeed.
Preventing Future Failures
Even the best-configured setups will experience occasional issues. Proactive certificate monitoring and alerting are crucial. Always use specialized SSL monitoring tools such as CertNotifier to receive alerts well before renewal issues affect your service. Don't let automated renewals lull you into a false sense of security.
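For a basic self-hosted check alongside a monitoring service, the openssl command-line tool can report whether a certificate is close to expiry. This is a minimal sketch, assuming openssl is installed; the function names are hypothetical. Any error (missing file, unreachable host) is deliberately treated the same as "expiring soon", so a broken check raises an alert rather than passing silently.

```shell
#!/bin/sh
# Sketch: detect certificates that are close to expiry.
# Assumes the openssl command-line tool; function names are made up.

# cert_expires_within CERT_FILE DAYS: succeeds if the certificate in
# CERT_FILE expires within DAYS days, has expired, or cannot be read.
cert_expires_within() {
    seconds=$(( $2 * 86400 ))
    ! openssl x509 -checkend "$seconds" -noout -in "$1" >/dev/null 2>&1
}

# remote_cert_expires_within HOST DAYS: same check against the certificate
# that HOST actually serves on port 443, which also catches a web server
# that was never reloaded after a successful renewal.
remote_cert_expires_within() {
    ! echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null |
        openssl x509 -checkend "$(( $2 * 86400 ))" -noout >/dev/null 2>&1
}
```

A daily job could then run something like cert_expires_within /etc/letsencrypt/live/example.com/cert.pem 14 && echo "certificate expires soon", where example.com and the 14-day threshold are placeholders to adapt.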