Towards Universal HTTPS Adoption
HTTPS was invented in 1994. We are now on the verge of a milestone: it is estimated that by the end of 2016, adoption will reach 50%. But how the hell has it taken more than two decades to secure half of the internet?
HTTP Considered Harmful
When HTTPS was implemented, it was widely thought that encryption was only necessary in rare cases, such as financial transactions. Indeed, it is not immediately obvious that most internet traffic should be encrypted. But the internet has become far broader than the original implementers expected. People post their inner thoughts and secrets on message boards. They learn about drugs, about sexuality, about politics. They watch porn, revealing their sexual orientation. All of these activities have the potential to destroy friendships, relationships, jobs and families. And even if these activities aren't illegal now, they may become so (potentially ex post facto[0]) when someone changes countries, or when their country changes on them.
Unencrypted traffic also poses security threats for encrypted traffic. Someone's password for their gardening forum is likely to be their password for their email. When they click on a link on a news site[1] they trust, and it takes them to what looks like their bank's website, they don't bother to double-check. Third parties are left vulnerable too: JavaScript injected into an unencrypted page can make its visitors unwitting participants in a DDoS attack[2].
Obstacles to HTTPS Adoption
The main obstacle to HTTPS adoption is a small set of browser security policies:
- Browsers will show a warning for "safe" mixed content, mainly images, making the page look less secure than plain HTTP.
- Browsers will block mixed content CSS and JavaScript, breaking the webpage unless the user intervenes, and show an error, again making the user feel less secure than plain HTTP.
- Browsers will strip the Referer header when making a mixed content request, in particular for ads served over HTTP, reducing ad revenue.
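The three policies above can be summarized as a toy decision function. This is only a sketch of the behavior as described in the list, not the logic of any real browser; the names `Decision`, `decide`, and `ACTIVE_TYPES`, and the resource type strings, are assumptions made for illustration.

```python
from dataclasses import dataclass

# Resource types that the policies above block outright when loaded insecurely.
# Everything else (mainly images) is treated as "safe" mixed content.
ACTIVE_TYPES = {"script", "stylesheet"}


@dataclass(frozen=True)
class Decision:
    action: str         # "load", "warn", or "block"
    send_referer: bool  # whether the Referer header accompanies the request


def decide(page_scheme: str, resource_scheme: str, resource_type: str) -> Decision:
    """Toy model of the three mixed content policies (hypothetical, simplified)."""
    mixed = page_scheme == "https" and resource_scheme == "http"
    if not mixed:
        # No mixed content: load normally and send the Referer header.
        return Decision("load", True)
    if resource_type in ACTIVE_TYPES:
        # Mixed CSS or JavaScript is blocked, and an error is shown.
        return Decision("block", False)
    # "Safe" mixed content loads with a warning, and the Referer is stripped.
    return Decision("warn", False)
```

For example, `decide("https", "http", "image")` yields a warning with no Referer, while the same image requested from a plain HTTP page loads silently with the Referer attached.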
Further note that, even when a website has successfully addressed mixed content, the mere threat of a warning or error can dissuade a sysadmin (or their boss) from deploying HTTPS.
These policies made sense to the original implementers given their assumptions about how HTTPS would be used. Any financial transaction was simply expected to be performed over HTTPS, so the downside of reduced adoption didn't come up. Surely the user should be warned if images on the page could be spoofed, hence the noticeable warning for mixed content. Loading JS or CSS insecurely allows an active attacker to compromise the page entirely, so it is totally inappropriate for financial transactions and should fail by default. And since sending the referer[3] with insecure requests could leak sensitive data in the path[4], it should definitely be stripped in these cases.
But for most websites, the benefit of these policies is minor.
Allowing mixed content CSS or JavaScript lets an active attacker[5] entirely compromise the webpage, but allowing mixed content images gives them little practical advantage over a passive attacker, while sending the Referer header gives them none.
Meanwhile these browser behaviors offer relatively little extra protection against passive attacks, which, in an era of ubiquitous data collection, are far easier to perform and hence of far greater concern for most websites[6].
Mixed content warnings are almost always triggered by static images, which are rarely[7] sensitive.
While eavesdropping on CSS and JavaScript requests may reveal the specific page a user is on within a website, it does not reveal the page's user-specific content[8].
Thus the impact of allowing such mixed content is similar to passing the Referer header, which directly reveals which page the user is on.
No-Security-Claim
There should be a way for websites to provide improved security without being burdened by these policies. This is not to say that browsers should never enforce them; as discussed above, they are entirely appropriate for financial transactions and other high-value targets for active attacks. It would not even be appropriate to show the usual HTTP icon on such pages when one of these policies is violated, because doing so would reduce the incentive to make sure such violations never occur. But I think a middle ground exists which would be radically better than the status quo.
There has been some progress along these lines. The W3C defines an unsafe-url referrer policy which tells the browser to send the referer with plain HTTP requests, as if the page were loaded over HTTP itself.
Unfortunately, referrer policies are not supported by IE, rendering them effectively useless for most pages.
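As a minimal sketch of how a site might opt in, the WSGI application below attaches a `Referrer-Policy: unsafe-url` response header to a page. The application name `app` and the page body are placeholders invented for this example, and delivering the policy as a response header (rather than by some other mechanism) is an assumption here.

```python
def app(environ, start_response):
    """Hypothetical WSGI app that serves one page with the unsafe-url policy."""
    body = b"<!doctype html><title>demo</title><p>hello</p>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Ask the browser to send the full referer even on plain HTTP requests.
        ("Referrer-Policy", "unsafe-url"),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the app can be passed to `wsgiref.simple_server.make_server` from the Python standard library.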
I propose a broader, simpler solution: a No-Security-Claim header.
If a page is sent with the No-Security-Claim header, it should be presented as if it were accessed over HTTP, even when accessed over HTTPS, and without regard to mixed content.
Additionally, the browser should send the referer as if the page were accessed over HTTP.
This would allow websites to provide the partial security discussed above at much lower cost.
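As a rough illustration of the proposal, the sketch below models how a browser might choose its behavior for a page. Everything here is hypothetical: the `Presentation` fields, the indicator strings, and the choice to treat the mere presence of the header (ignoring its value) as opting in are assumptions, since the proposal itself specifies none of them.

```python
from typing import Mapping, NamedTuple


class Presentation(NamedTuple):
    indicator: str                  # what the address bar shows: "secure" or "plain"
    block_active_mixed: bool        # block CSS/JS loaded over plain HTTP?
    referer_on_http_requests: bool  # send Referer with plain HTTP requests?


def presentation(scheme: str, headers: Mapping[str, str]) -> Presentation:
    """Hypothetical browser logic for the proposed No-Security-Claim header."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    if scheme == "http" or "no-security-claim" in names:
        # Either plain HTTP, or HTTPS that makes no security claim:
        # present and treat the page exactly like plain HTTP.
        return Presentation("plain", False, True)
    # Ordinary HTTPS: the existing policies apply in full.
    return Presentation("secure", True, False)
```

Under this model, `presentation("https", {"No-Security-Claim": "1"})` is indistinguishable from `presentation("http", {})`, which is the whole point: the traffic is encrypted on the wire, but the page promises the user nothing.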
Honestly, I am not hopeful that any solution would ever be ported to old browsers (i.e. IE), and so universal HTTPS adoption is probably at least a decade off.
The best time for this proposal would have been 20 years ago.
But I hold out some slim hope that implementing No-Security-Claim would be a simple matter of changing a few conditionals, and so could sneak into a security update.
Until then, the battle continues.
- ^
In law, "ex post facto" refers to a law that is applied to actions that predate the passage of the law. While this is illegal in many countries (including the United States), it is not uncommon historically, and like any law, a law against ex post facto laws can be revoked.
- ^
The vast majority of news sites are currently HTTP-only.
- ^
See the CloudFlare blog for an explanation of JavaScript-based DDoS attacks, and Netresec's analysis of China's use of this and other attacks against GitHub.
- ^
The attentive reader may note that "referer" is not spelled consistently throughout. This is intentional. When referring to the Referer header, it is spelled "referer", as this is how the header is spelled in the HTTP specification and in every major implementation. However, the W3C's referrer policy specification uses the standard English spelling "referrer", and so that spelling is used in reference to said specification.[9]
- ^
Defense in depth aside, putting sensitive data in the path is a terrible idea. Paths are shared casually, often via URL shortening services, which make them brute-force enumerable.
- ^
Internet traffic is vulnerable to two different kinds of attacks: passive attacks, which just read the data being sent over the network, and active attacks, which block traffic being sent over the network and substitute it with their own. Performing an active attack requires having full control over one of the computers on the route between the user and the server, while a passive attack requires only the ability to steal data from one of these computers, possibly after the fact, or the ability to surreptitiously tap into a fiber-optic cable.
- ^
In fact, thanks to Edward Snowden, we know that the Five Eyes agencies are performing such attacks routinely and at a massive scale. And they are probably not alone.
- ^
Sexually explicit images being an exception.
- ^
This is assuming these are requests for static resources. Dynamic resources such as JSON should almost always be requested from servers under the control of the website's owner, and so if the page can be served over HTTPS, it should be easy to serve them over HTTPS too.
- ^
The use of the word "refer" in discussing the spelling of "referer" is also intentional, just in case the facts of the matter did not make you sad enough.