The world’s largest free, not-for-profit Certificate Authority (CA) – Let’s Encrypt – has been making headlines all week. A few days ago, it announced that it had issued its billionth TLS certificate – a commendable achievement.
This week, however, the tables seem to have turned.
In an unprecedented move that shook netizens, Let’s Encrypt announced that it was being forced to revoke 3 million issued certificates owing to a bug in the software that automated its CAA verification processes (more on that below). For patrons of the CA, this means that once the revocation takes effect, any of their domains or subdomains that were protected by the revoked certificates will be flagged as insecure by browsers until those certificates are renewed or replaced. By the time you read this article, the revocation will have begun in earnest (20:00 UTC / 3 PM EST, March 4); it is expected to be completed by March 5.
Let’s Encrypt clarified that it had no choice in the matter. The rules of incident response are laid out in the Baseline Requirements, a document published by the CA/Browser Forum, and every CA has to abide by them, which is why the CA could give its users only a short notice period.
Let’s Encrypt seems to have a handle on things…
The CA has been quite transparent about the issue so far. It published a blog post explaining the move and telling users which steps they need to take to mitigate the issue, and it has already emailed every user with an affected certificate.
It has also published a list of the serial numbers of the certificates in question, and made a free tool available where users can manually check whether their domain has an affected certificate.
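For readers who would rather script that check than run it by hand, here is a minimal sketch in Python. It assumes the published serial numbers have been saved locally with one hex serial per line, which may not match the layout of the real file; the helper names (normalize_serial, load_affected_serials, fetch_serial, affected_hosts) are ours and purely illustrative.

```python
import socket
import ssl


def normalize_serial(serial: str) -> str:
    """Return a serial as lowercase hex without separators or leading zeros."""
    cleaned = serial.replace(":", "").replace(" ", "").lower()
    return cleaned.lstrip("0") or "0"


def load_affected_serials(lines):
    """Build a set of normalized serials from an iterable of text lines.

    Assumption: one hex serial per line; blank lines are skipped.
    """
    return {normalize_serial(line.strip()) for line in lines if line.strip()}


def fetch_serial(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to a host over TLS and return the serial of the certificate it serves."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return normalize_serial(tls.getpeercert()["serialNumber"])


def affected_hosts(host_serials, affected):
    """Return the hosts whose serial is in the affected set, sorted by name."""
    return sorted(
        host
        for host, serial in host_serials.items()
        if normalize_serial(serial) in affected
    )
```

In practice you would fill host_serials by calling fetch_serial for each host in your inventory (or by reading serials from certificates on disk), then pass the result to affected_hosts along with the loaded list.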
…but there is one little problem.
While everything is being handled in a coordinated fashion, this mass revocation poses a problem for users with thousands of certificates on file. Because Let’s Encrypt handles renewals automatically via the ACME protocol, there is a fair chance that some affected certificates have already been renewed. But the fact that any number of revoked, inactive certificates could be spread across multiple domains and servers is going to cause security personnel a lot of grief. Using the provided tool to locate each of these certificates manually, and then getting them renewed, can take a lot longer than a few hours, especially when the process has to be repeated thousands of times.
Because a revoked certificate usually means that browsers will warn visitors away from a domain, the time between discovering a revoked certificate and renewing it will cost website owners dearly in lost traffic and the opportunity cost that comes with it.
That’s bad, but that’s not even the worst bit. When an endpoint remains unprotected for a period of time, that small window is all a malicious actor would need to burrow into the network. Think of it as a weak link in an otherwise strong security setup. This opens the door to potential security breaches and data theft, and that’s just the tip of the iceberg.
How can this be remediated without having to suffer downtime?
It’s simple, really.
One way to avoid this drawn-out renewal run is to use an automated certificate management tool. With a certificate lifecycle automation platform, it’s as simple as clicking a button to run a network scan, having the tool locate every revoked or inactive certificate for you, and setting up a bulk renewal from the CA in question (Let’s Encrypt, in this case, but it could be any issuer of your choice). The process takes a few minutes at most, in drastic contrast to the hours you’d spend doing it manually. Here at AppViewX, we’ve got one of those tools ourselves. We call it AppViewX CERT+, and if you’d like to give it a shot, you can do so here.
Here’s a sneak peek into our network monitoring software, which gives administrators a bird’s-eye-view into certificate renewals and statuses.
Breaking down Boulder’s CAA-check-breaking bug.
Now that we’ve told you how you can get out of this without looking too shabby, let’s talk about the root cause of this fiasco.
It’s common knowledge that Let’s Encrypt provides an automated renewal service for all its certificates. Several precautions and checks are run before a certificate is issued or renewed to ensure that it is done in a secure manner. One of those checks is the CAA check, where CAA stands for Certification Authority Authorization.
As part of this process, the server software that the CA uses for verification (Let’s Encrypt’s tool is called Boulder, FYI) runs a DNS lookup to obtain the restrictions a domain’s owner has placed upon certificate issuance. The fact that an owner can specify exactly which issuers are allowed to issue certificates for the domain adds a layer of security that keeps attackers from bluffing their way into obtaining a certificate from some other CA. In common practice, CAA records are routinely checked prior to issuance.
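To make the idea of a CAA check a little more concrete, here is a simplified sketch in Python. It is not Boulder's code and it skips real DNS lookups: the caa_zone dictionary stands in for DNS, the function names are hypothetical, and only the "issue" tag is handled (wildcard rules via "issuewild" are ignored). It loosely follows the lookup described in the CAA specification: walk up from the requested name until a CAA record set is found, then see whether the CA is listed.

```python
def find_caa_rrset(domain, caa_zone):
    """Climb from the domain toward the TLD; return the first non-empty CAA set."""
    labels = domain.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        records = caa_zone.get(candidate)
        if records:
            return records
    return []


def issuance_allowed(domain, ca_identifier, caa_zone):
    """Return True if the CA may issue for the domain under this simplified model.

    caa_zone maps a domain name to a list of (tag, value) pairs and is a
    stand-in for real DNS responses (an assumption made for this sketch).
    """
    records = find_caa_rrset(domain, caa_zone)
    issue_values = [value for tag, value in records if tag.lower() == "issue"]
    if not issue_values:
        # No "issue" records found: issuance is not restricted.
        return True
    for value in issue_values:
        # Anything after ';' is a parameter list; a bare ";" names no issuer.
        issuer = value.split(";", 1)[0].strip().lower()
        if issuer == ca_identifier.lower():
            return True
    return False
```

With a zone such as {"example.com": [("issue", "letsencrypt.org")]}, a request for www.example.com would be allowed for letsencrypt.org, while a domain whose only record is ("issue", ";") would be refused for every CA.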
The bug – possibly an overlooked programming error – had a lot to do with the logic involved with checking CAA records.
It was quite elementary, really. As Let’s Encrypt described it:
The bug: when a certificate request contained N domain names that needed CAA rechecking, Boulder would pick one domain name and check it N times. What this means in practice is that if a subscriber validated a domain name at time X, and the CAA records for that domain at time X allowed Let’s Encrypt issuance, that subscriber would be able to issue a certificate containing that domain name until X+30 days, even if someone later installed CAA records on that domain name that prohibit issuance by Let’s Encrypt.
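The behaviour described in that quote can be illustrated with a toy example. The Python sketch below is a hypothetical reconstruction of the symptom only, not Boulder's actual code (Boulder is not written in Python, and the source does not say which name got picked, so choosing the first one is our assumption). The caa_allows argument is any function that says whether a given name currently permits issuance.

```python
def recheck_buggy(names, caa_allows):
    """Mimic the described symptom: one name is picked and checked N times."""
    if not names:
        return True, []
    picked = names[0]  # assumption: the source does not say which name is picked
    checked = [picked for _ in names]
    results = [caa_allows(name) for name in checked]
    return all(results), checked


def recheck_fixed(names, caa_allows):
    """Check every name in the request exactly once."""
    checked = list(names)
    results = [caa_allows(name) for name in checked]
    return all(results), checked
```

If the second of three names has since added a CAA record forbidding issuance, recheck_buggy still reports success because it never looks at that name, while recheck_fixed refuses the request.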
In layman’s terms, instead of running CAA checks on ten domain names, the bug would cause a check to be run on one domain name ten times, resulting in certificates being issued for the nine other domain names without regard for any restrictions their owners may have put in place. While the CA has patched this bug, 3 million certificates had already been issued by Boulder with the bug in effect, which, predictably, forced Let’s Encrypt to revoke all of them.
However, the CA’s blog post also clarified that a million of those certificates were actually duplicates of other affected certificates, which puts the number of distinct affected certificates at around 2 million.
In conclusion, this event is unlikely to have devastating long-term effects (think Equifax), but for users it serves as a wake-up call to have systems in place that quickly remediate such issues by automating the renewal, revocation, and installation of their TLS certificates.
And that system is called a Certificate Lifecycle Automation platform. If you’d like to give ours a try, visit www.appviewx.com or click here for a personalized demo or a free trial.