The best thing about technology, is that it’s always changing. I learn new things every day, even in areas that I think I know a lot about, such as web performance.
See, when I was looking at performance optimisations for a project, I found a way to reduce connection times by getting the browser to speculatively perform DNS lookups for a list of third-party domains. This takes about 10-30ms off of the initial connection to a new domain. It may not sound like much, but it can make a big difference to how fast an image loads.
The last time I looked at this was before the internet went all-in on HTTPS, and so when I looked at the network timing of a new third-party HTTPS service, I was wondering if there was anything I could do to help the browser with the additional TLS negotiation and connection.
The affect shown here is achieved by adding a meta tag of <meta rel=”dns-prefetch” href=”//example.com”> to the <head> of the HTML. You can also do this by adding it as a header to the response of the HTML document.
That fetches the DNS, but not the TLS negotiation or the connection setup. There is a different hint that you can give to the browser that will do all of those things, called preconnect. It works much in the same way as dns-prefetch but sets up the rest of the connection as well. When working, it looks like this
The code here looks like this: <meta rel=”preconnect” href=”https://example.com”>. That looks nice and easy, and you’ve just saved another fifty milliseconds on the first connection to that domain over HTTPS.
However, this is a web standard, so it’s not all that simple. Firstly, the browser support for this hint is not brilliant with no support in IE or Safari. The best thing to do for those browsers is to keep the dns-prefetch hint as well as the preconnect hint.
Secondly, there is a limit to how many you can use effectively, and that limit is 6 connections. The reasons for this go back into the aeons of ancient internet history (read: the 90s) where browsers and internet routers were most efficient with 6 open connections at any one time. With modern routers and browsers, this isn’t true, but these limits aren’t likely to be changed any time soon due to this enormous internet legacy.
Finally, there is an extra attribute that you can add to this <meta> tag called “crossorigin” which changes how preconnect makes its connections. As Ilya Grigorik explains in his post:
The font-face specification requires that fonts are loaded in “anonymous mode”, which is why we must provide the crossorigin attribute on the preconnect hint: the browser maintains a separate pool of sockets for this mode.
What that means is that some resources, such as fonts, ES6 modules or XHR, need to be accessed in a “non-credentialed fetch”, or crossorigin=”anonymous”. Otherwise, these resources won’t load and you’ll see a cancelled resource request in the network requests. the “anonymous” value is the default if just “crossorigin” is provided, so if you like shorthand, you don’t need to add the =”anonymous” part to your code.
That’s it. Preconnect is a really useful hint that can save milliseconds on those third-party requests. Give it a go.
I’ve been playing with HTTP Strict Transport Security (HSTS, I’m late to the party as usual) and there’s some misconceptions that I had going in that I didn’t know about that threw me a bit. So, here’s a no nonsense guide to HSTS.
The HSTS Header is pretty simple to implement
I actually thought that this would be the hard bit, but actually putting the header in is very simple. As it’s domain specific you just need to set it at the web server or load balancer level. In Apache, it’s pretty simple:
Header always set Strict-Transport-Security "max-age=10886400;"
You can also upgrade all subdomains using this header
A small addition to the header auto-upgrades all subdomains to HTTPS, making it really simple to upgrade long-outdated content deep within databases or on static content domains without doing large-scale migrations.
Header always set Strict-Transport-Security "max-age=10886400; includeSubdomains;"
Having a short max-age is good when you’re starting out with subdomains
@steveworkman unless your env is so simple you can be *absolutely* sure there aren’t HTTP only subdomains you forgot about.
If you have a max-age length shorter than 18 weeks then you are ineligible for the preload list.
There’s a preload list – browsers know about HSTS-supported sites
It turns out that all of the browsers include a “preload” list of websites that support HSTS, and will therefore always point the user to the HTTPS version of the website no matter what link they have come from to get there.
So, how does it work?
Well, you go to https://hstspreload.appspot.com and submit your website to the list. Chrome, Firefox, Opera, IE 11 (assuming you got a patch after June 2015), Edge and Safari pick it up and will add you to a list to always use HTTPS, which will take away a redirect for you. There are four other requirements to meet – have a valid cert (check), include subdomains, have a max-age of 18 weeks, add a preload directive and some redirection rules.
Header always set Strict-Transport-Security "max-age=10886400; includeSubDomains; preload"
Here are the standard redirection scenarios for a non-HSTS site that uses a www subdomain (like most sites):
User enters https://www.example.com – no HSTS. There are 0 redirects in this scenario as the user has gone straight to the secure www domain.
User enters https://example.com – no HSTS. There is 1 redirect as the web server adds the www subdomain
User enters example.com – no HSTS. There is 1 redirect here as the web server redirects you to https://www.example.com in 1 hop, adding both HTTPS and the www subdomain
How to Redirect HTTP to HTTPS as described by Ilya Grigorik at Google I/O in 2014
This is the best-practice for standard HTTPS migrations as set out in HTTPS Everywhere as Ilya Grigorik shows us that scenario 3 should only have 1 redirect, otherwise you get a performance penalty.
HSTS goes against this redirection policy… for good reason
To be included on the preload list you must first redirect to HTTPS, then to the www subdomain:
This felt incredibly alien to me, so I started asking some questions on Twitter, and Ilya pointed me in Lucas Garron‘s direction
This order makes sure that the client receives a dynamic HSTS header from example.com, not just www.example.com
http -> https -> https://www is is good enough to protect sites for the common use case (visiting links to the parent domain or typing them into the URL bar), and it is easy to understand and implement consistently. It’s also simple for us and other folks to verify when scanning a site for HSTS.
This does impact the first page load, but will not affect subsequent visits.
And once a site is actually preloaded, there will still be exactly one redirect for users.
If I understand correctly, using HTTP/2 you can also reuse the https://example.com connection for https://www.example.com (if both domains are on the same cert, which is usually the case).
Given the growth of the preload list, I think it’s reasonable to expect sites to use strong HSTS practices if they want to take up space in the Chrome binary. This requirement is the safe choice for most sites.
Let me try to visualise that in the scenarios:
First visit, user types example.com into their browser. They get a 301 redirect to https://example.com and receive the HSTS header. They are then 301 redirected to https://www.example.com. 2 redirects
Second visit, the browser knows you’re on HSTS and automatically upgrades you to HTTPS before the first redirect, so typing yell.com into the browser performs 1 redirect from https://example.com to https://www.example.com
If you’re in the preload list, the second visit scenario happens on the first visit. 1 redirect
So, that makes sense to me. In order to set the HSTS upgrade header for all subdomains, it needs to hit the naked domain, not the www subdomain. This appears to be a new requirement to be added to the preload list, as the Github issue was raised on May 19th this year, and Lucas has said that this new rule will not be applied to websites that are already on the list (like Google, Twitter etc).
For me, this takes away much of the usefulness of HSTS, which is meant to save redirects to HTTPS by auto-upgrading connections. If I have to add another redirect in to get the header set on all subdomains, I’m not sure if it’s really worth it.
So, I asked another question:
@lgarron but we have 20 years worth of back links around the internet. Are there other benefits of the HSTS header + preload list?
So it helps when people type in the URL, sending them to HTTPS first. This takes out the potential for any insecure traffic being sent. Thinking of the rest of the links on the internet, the vast majority of yell.com links will include the www subdomain, so HSTS and the preload list will take out that redirect, leaving zero redirects. That’s a really good win, that Lucas confirmed.
@lgarron yes, and our sitemaps are correct with that link. What will happen to http versions of those (most common use case)?
Summary – HSTS will likely change how you perform redirects
So, whilst this all feels very strange to me, and goes against the HTTPS Everywhere principles, it will eventually make things better in the long run. Getting subdomains for free is a great boost, though the preload list feels like a very exclusive club that you have to know about in order to make the best of HSTS. It’s also quite difficult to get off the list, should you ever decided that HTTPS is not for you as you’ll have the HSTS header for 18 weeks, and there is no guarantee that the preload list will be modified regularly. It’s an experiment, but one that changes how you need to implement HSTS.
So, that’s my guide. Comments, queries, things I’ve gotten wrong, leave a comment below or on Twitter: @steveworkman
Over the last few months I’ve been putting together my talk for the year, based on a blog post that is titled “HTTPS is Hard”. You can read the full article on the Yell blog on which it is published. There’s also an abridged version on Medium. It’s been a very long time coming, and has changed over the time I’ve been writing it, so I thought I’d get down a few reflections on the article.
It’s really long, and took a long time to write
This is firstly, the longest article I’ve written (at over four thousand words, it’s a quarter of the length of my dissertation) and it’s taken the longest time to be published. I had a 95% complete draft ready back in September, when I was supposed to be working on my Velocity talk for October but found myself much more interested in this article. Dan Applequist has repeatedly asked me to “put it in a blog post, the TAG would be very interested” – so finally, it’s here.
The truth is that I’m constantly tweaking the post. Even the day before it goes live, I’m still making modifications as final comments and notes come in from friends that I’ve been working with on this. Also, it seems like every week the technology moves on and the landscape shifts: Adobe offers certs for free, Dreamhost gives away LetsEncrypt HTTPS certs through a one-click button, Netscaler supports HTTP/2, the Washington Post write an article, Google updates advice and documentation, and on and on and on… All through this evolution, new problems emerge and the situation morphs and I come up with new ways to fix things, and as I do, they get put into the blog post. Hence, it’s almost a 20 minute read.
Every day something silly happens. Today’s was from generally-awesome tech-friendly company Mailchimp. They originally claimed that “Hosted forms are secure on our end, so we don’t need to offer HTTPS. We get that some of our users would like this, though” (tweet has since been deleted). Thankfully, they owned up and showed CalEvans how to do secure forms.
@CalEvans We apologize for the inaccurate info from earlier, Cal. It is actually possible to use HTTPS with our hosted forms by grabbing the
Still, it’s this kind of naivety that puts everyone’s security at risk. A big thumbs up to Mailchimp for rectifying the situation.
If I were to have started today, would HTTPS still be hard?
Yes, though nowhere near as hard. We’d still have gone through the whole process, but it wouldn’t have taken as long (the Adobe and Netscaler bits were quite time-consuming) and the aftermath wouldn’t have gone on for anywhere near as long if I’d have realised in advance about the referrer problem.
If you’d have known about the referrer issue, would you have made the switch to HTTPS?
Honestly, I’m not sure I would have pushed so hard for it. We don’t have any solid evidence to say it’s affecting any business metrics, but I personally wouldn’t like the impression that traffic just dropped off a cliff, and it wouldn’t make me sign up as an advertiser. Is this why Yelp, TripAdvisor and others haven’t migrated over? Who can say…
This is why the education piece of HTTPS is so important, because developers can easily miss little details like referrers, and just see the goals of ranking and HTTP/2 and just go for it.
The point of the whole article is that there just isn’t the huge incentive to move to HTTPS. Having a padlock doesn’t make a difference to users unless they sign in or buy something. There needs to be something far more aggressive to convince your average developer to move their web site to HTTPS. I am fully in support of Chrome and Firefox’s efforts to mark HTTP as insecure to the user. The only comments I get around the office about HTTPS happen when a Chrome extension causes a red line to go through the protocol in the address bar – setting a negative connotation around HTTP seems to be the only thing that gets people interested.
What’s changed since you wrote the article?
I am really pleased to see the Google Transparency Report include a section on HTTPS (blog post). An organisation with the might and engineering power of Google are still working towards HTTPS, overcoming technical boundaries that make HTTPS really quite hard. It’s nice to know that it’s not just you fighting against the technology.
What about “privileged apps” – you don’t talk about that
The “Privileged Contexts” spec AKA “Powerful Features” and how to manage them is a working draft and there’s a lot of debate still to be had before they go near a browser. I like how the proposals work and how they’ve been implemented for Service Worker. I also appreciate why they’re necessary, especially for Service Worker (the whole thread of “why” can be read on github). I hope that Service Worker has an effect on HTTPS uptake, though this will only truly happen should Safari adopt the technology.
It looks like Chrome is going to turn off Geolocation from insecure origins very soon, as that part of the powerful features task list has been marked as “fixed” as of March 3rd. Give it a few months and geolocation will be the proving ground for the whole concept of powerful features – something that I’ll be watching very closely.
Publishers around the world use the mobile web to reach these readers, but the experience can often leave a lot to be desired. Every time a webpage takes too long to load, they lose a reader—and the opportunity to earn revenue through advertising or subscriptions. That’s because advertisers on these websites have a hard time getting consumers to pay attention to their ads when the pages load so slowly that people abandon them entirely.
I completely understand the need for this project – the web is slow and advertising is such a key revenue source for publishers that these adverts are ruining the experience. Facebook’s Instant project is designed to do much the same thing. AMP is not a new idea.
Technically, AMP builds on open standards and allows you to simply replace some parts of your build and some components of your system with AMP versions of those pages. What happens with these pages is the secret sauce – Google knows of these pages, and if they pass the AMP tests it caches versions of those sites on its own servers.
Have a look at these two search pages. Both are searches for “Obama” on a Nexus 5 (emulated). The one on the right is the current search page, the one on the left is with AMP articles.
Notice where the first organic listing is in each page. Currently, it’s just below the first page, but with AMP, it’s over two screens of content away! That’s absolutely massive, and for anyone not in the top 3 results, they may as well not be there.
Basically, if you’re not an AMP publisher, expect your traffic to drop, by double-digit percentages.
Short-term benefits, Long-term harm
Short-term, AMP will make publishers sit up and take notice of web performance, it’ll make websites faster across the board, and make more customers use Google. For users, this is a win/win situation.
I’ve come to the conclusion that this project is harmful to the web in the long term. If a publisher isn’t getting traffic because it’s all going to AMP publishers, then the amount of content on Google drops, quality drops and is put into these large content producers. You end up with less diversity in the web, because small producers can’t make money without using AMP. Even in the end when you’ve only got a few publishers who can survive, they will be encouraged to reduce adverts, or be limited to the subset that Google’s DoubleClick deem performant enough to be included in the AMP-friendly set of advertisers. This may block the big-money adverts, which keep these sites going.
It also creates dependencies on Google, and whilst it’s unlikely to go down any time soon, being cached on their servers and allowing for only the most basic tracking mechanism, and only for browsers that Google supports. It creates a two-tier system that the web is firmly against.
This project seems to polyfill a developer/organisation’s lack of time to dedicate to performance. Many very clever people are working on this education problem (hello to Path to Perf, Designing for Performance and PageSpeed) though it will take time. Short-term fixes like AMP and Facebook Instant are encouraging developers to take shortcuts to performance, handing their problems off to Google. This does nothing for the underlying issues, but with Google giving AMP such prominence in its search results, how can publishers resist?
Or is it a temporary solution
I hope that from all of this, developers sit up and take notice of web performance; improve their sites and provide a better experience for all. If we don’t, Google will keep solutions like this around, leaving the only content that gets interaction as the content that Google approves of – and no one wants that.