Articles in the ‘Performance’ Category

A Primer on Preconnect

The best thing about technology is that it’s always changing. I learn new things every day, even in areas that I think I know a lot about, such as web performance.

See, when I was looking at performance optimisations for a project, I found a way to reduce connection times by getting the browser to speculatively perform DNS lookups for a list of third-party domains. This takes about 10-30ms off of the initial connection to a new domain. It may not sound like much, but it can make a big difference to how fast an image loads.

The last time I looked at this was before the internet went all-in on HTTPS, and so when I looked at the network timing of a new third-party HTTPS service, I was wondering if there was anything I could do to help the browser with the additional TLS negotiation and connection.

Waterfall showing DNS prefetch on a HTTPS resource

The effect shown here is achieved by adding a link tag of <link rel="dns-prefetch" href="//example.com"> to the <head> of the HTML. You can also do this by adding it as a Link header on the response of the HTML document.

That fetches the DNS, but not the TLS negotiation or the connection setup. There is a different hint that you can give to the browser that will do all of those things, called preconnect. It works much in the same way as dns-prefetch but sets up the rest of the connection as well. When working, it looks like this

The code here looks like this: <link rel="preconnect" href="https://example.com">. That looks nice and easy, and you’ve just saved another fifty milliseconds on the first connection to that domain over HTTPS.

However, this is a web standard, so it’s not all that simple. Firstly, the browser support for this hint is not brilliant with no support in IE or Safari. The best thing to do for those browsers is to keep the dns-prefetch hint as well as the preconnect hint.
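To make that fallback concrete, here’s a minimal sketch that emits both hints for one origin. The function name resourceHints and the example origin are mine, purely for illustration, and it assumes the hints are written as <link> elements in the <head>.

```typescript
// Build both resource hints for one third-party origin.
// dns-prefetch is kept as a fallback for browsers that ignore preconnect.
function resourceHints(origin: string): string[] {
  // dns-prefetch only needs the host, so use the protocol-relative form.
  const protocolRelative = origin.replace(/^https?:/, "");
  return [
    `<link rel="dns-prefetch" href="${protocolRelative}">`,
    `<link rel="preconnect" href="${origin}">`,
  ];
}

// Hypothetical usage: one string ready to drop into the <head>.
const headMarkup: string = resourceHints("https://example.com").join("\n");
```

A browser that understands preconnect does the full connection setup, and one that doesn’t still gets the DNS lookup out of the way.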

Secondly, there is a limit to how many you can use effectively, and that limit is 6 connections. The reasons for this go back into the aeons of ancient internet history (read: the 90s) where browsers and internet routers were most efficient with 6 open connections at any one time. With modern routers and browsers, this isn’t true, but these limits aren’t likely to be changed any time soon due to this enormous internet legacy.

Finally, there is an extra attribute that you can add to this <link> tag called “crossorigin” which changes how preconnect makes its connections. As Ilya Grigorik explains in his post:

The font-face specification requires that fonts are loaded in “anonymous mode”, which is why we must provide the crossorigin attribute on the preconnect hint: the browser maintains a separate pool of sockets for this mode.

What that means is that some resources, such as fonts, ES6 modules or XHR, need to be accessed in a “non-credentialed fetch”, or crossorigin="anonymous". Otherwise, these resources won’t load and you’ll see a cancelled resource request in the network requests. The “anonymous” value is the default if just “crossorigin” is provided, so if you like shorthand, you don’t need to add the ="anonymous" part to your code.
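Here’s a rough sketch of how you might decide when to add the attribute. The ResourceKind names and the preconnectTag function are made up for this example, and the list of “anonymous” kinds simply follows the ones mentioned above (fonts, ES6 modules, XHR); real-world fetch modes have more nuance than this.

```typescript
// Kinds of resource you might load from a third-party origin (illustrative names).
type ResourceKind = "image" | "script" | "stylesheet" | "font" | "module" | "xhr";

// Assumption for this sketch: these kinds are fetched without credentials,
// so they use the separate "anonymous" socket pool.
const ANONYMOUS_KINDS: ResourceKind[] = ["font", "module", "xhr"];

// Emit a preconnect hint, adding the shorthand crossorigin attribute
// (equivalent to crossorigin="anonymous") when the resource needs it.
function preconnectTag(origin: string, kind: ResourceKind): string {
  const anonymous = ANONYMOUS_KINDS.indexOf(kind) !== -1;
  return anonymous
    ? `<link rel="preconnect" href="${origin}" crossorigin>`
    : `<link rel="preconnect" href="${origin}">`;
}
```

If one origin serves both kinds of resource, for example images and fonts, you would emit both variants so that each socket pool gets a warmed-up connection.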

That’s it. Preconnect is a really useful hint that can save milliseconds on those third-party requests. Give it a go.

For reference, my conversation started on Twitter with this tweet and ended with this tweet from Yoav Weiss.

HSTS – a no-nonsense guide

I’ve been playing with HTTP Strict Transport Security (HSTS; I’m late to the party as usual) and I went in with some misconceptions that threw me a bit. So, here’s a no-nonsense guide to HSTS.

The HSTS Header is pretty simple to implement

I thought that this would be the hard bit, but putting the header in is very simple. As it’s domain-specific, you just need to set it at the web server or load balancer level. In Apache, it’s pretty simple:

Header always set Strict-Transport-Security "max-age=10886400;"

You can also upgrade all subdomains using this header

A small addition to the header auto-upgrades all subdomains to HTTPS, making it really simple to upgrade long-outdated content deep within databases or on static content domains without doing large-scale migrations.

Header always set Strict-Transport-Security "max-age=10886400; includeSubdomains;"
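If the magic number looks odd, it’s just 18 weeks expressed in seconds. Here’s a small sketch that builds the header value from a number of weeks; the HstsOptions shape and the hstsHeaderValue name are my own, not from any library.

```typescript
// Options for the header value (illustrative shape, not a real library type).
interface HstsOptions {
  maxAgeWeeks: number;
  includeSubDomains?: boolean;
}

const SECONDS_PER_WEEK = 7 * 24 * 60 * 60; // 604800

// Build the value for the Strict-Transport-Security header.
function hstsHeaderValue(options: HstsOptions): string {
  const directives: string[] = [
    `max-age=${Math.round(options.maxAgeWeeks * SECONDS_PER_WEEK)}`,
  ];
  if (options.includeSubDomains) {
    directives.push("includeSubDomains");
  }
  return directives.join("; ");
}

// 18 weeks * 604800 seconds = 10886400, the value used in the Apache lines above.
const eighteenWeeks: string = hstsHeaderValue({ maxAgeWeeks: 18, includeSubDomains: true });
```

Keeping the age in weeks makes it easier to start with a short max-age and raise it once you’re happy that every subdomain works over HTTPS.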

Having a short max-age is good when you’re starting out with subdomains

Having a short max-age is bad in the long-run

If you have a max-age length shorter than 18 weeks then you are ineligible for the preload list.

Wait, what?

There’s a preload list – browsers know about HSTS-supported sites

It turns out that all of the browsers include a “preload” list of websites that support HSTS, and will therefore always point the user to the HTTPS version of the website no matter what link they have come from to get there.

So, how does it work?

Well, you go to https://hstspreload.appspot.com and submit your website to the list. Chrome, Firefox, Opera, IE 11 (assuming you got a patch after June 2015), Edge and Safari pick it up and will add you to a list to always use HTTPS, which will take away a redirect for you. There are four other requirements to meet – have a valid cert (check), include subdomains, have a max-age of 18 weeks, add a preload directive and some redirection rules.

Header always set Strict-Transport-Security "max-age=10886400; includeSubDomains; preload"
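As a sanity check before submitting, you could test a header value against the header-related requirements listed above. This is only a sketch under those assumptions: preloadProblems is a name I made up, it knows nothing about your certificate or redirects, and the preload list’s own rules may differ from what I’ve described here.

```typescript
const EIGHTEEN_WEEKS_IN_SECONDS = 18 * 7 * 24 * 60 * 60; // 10886400

// Return the header-related requirements (as described above) that the value fails.
// An empty array means the header itself looks eligible.
function preloadProblems(headerValue: string): string[] {
  // Directive names are case-insensitive, so compare in lower case.
  const directives = headerValue
    .split(";")
    .map((d) => d.trim().toLowerCase())
    .filter((d) => d.length > 0);

  let maxAge = NaN;
  directives.forEach((d) => {
    if (d.indexOf("max-age=") === 0) {
      maxAge = Number(d.slice("max-age=".length));
    }
  });

  const problems: string[] = [];
  // NaN fails this comparison too, which covers a missing or malformed max-age.
  if (!(maxAge >= EIGHTEEN_WEEKS_IN_SECONDS)) {
    problems.push("max-age must be at least 18 weeks");
  }
  if (directives.indexOf("includesubdomains") === -1) {
    problems.push("includeSubDomains is missing");
  }
  if (directives.indexOf("preload") === -1) {
    problems.push("preload is missing");
  }
  return problems;
}
```

Run against the three Apache examples in this post, only the last one comes back with no problems.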

Here are the standard redirection scenarios for a non-HSTS site that uses a www subdomain (like most sites):

  1. User enters https://www.example.com – no HSTS. There are 0 redirects in this scenario as the user has gone straight to the secure www domain.
  2. User enters https://example.com – no HSTS. There is 1 redirect as the web server adds the www subdomain
  3. User enters example.com – no HSTS. There is 1 redirect here as the web server redirects you to https://www.example.com in 1 hop, adding both HTTPS and the www subdomain

How to Redirect HTTP to HTTPS as described by Ilya Grigorik at Google I/O in 2014

This is the best practice for standard HTTPS migrations, as set out in HTTPS Everywhere, where Ilya Grigorik shows us that scenario 3 should only have 1 redirect; otherwise you get a performance penalty.

HSTS goes against this redirection policy… for good reason

To be included on the preload list you must first redirect to HTTPS, then to the www subdomain:

`http://yell.com` (HTTP) should immediately redirect to `https://yell.com` (HTTPS) before adding the www subdomain. Right now, the first redirect is to `https://www.yell.com/`.

This felt incredibly alien to me, so I started asking some questions on Twitter, and Ilya pointed me in Lucas Garron’s direction.

Following that link I get a full explanation:

This order makes sure that the client receives a dynamic HSTS header from example.com, not just www.example.com

http -> https -> https://www is good enough to protect sites for the common use case (visiting links to the parent domain or typing them into the URL bar), and it is easy to understand and implement consistently. It’s also simple for us and other folks to verify when scanning a site for HSTS.

This does impact the first page load, but will not affect subsequent visits.
And once a site is actually preloaded, there will still be exactly one redirect for users.

If I understand correctly, using HTTP/2 you can also reuse the https://example.com connection for https://www.example.com (if both domains are on the same cert, which is usually the case).

Given the growth of the preload list, I think it’s reasonable to expect sites to use strong HSTS practices if they want to take up space in the Chrome binary. This requirement is the safe choice for most sites.

Let me try to visualise that in the scenarios:

  1. First visit, user types example.com into their browser. They get a 301 redirect to https://example.com and receive the HSTS header. They are then 301 redirected to https://www.example.com. 2 redirects
  2. Second visit, the browser knows you’re on HSTS and automatically upgrades you to HTTPS before the first redirect, so typing example.com into the browser performs 1 redirect from https://example.com to https://www.example.com
  3. If you’re in the preload list, the second visit scenario happens on the first visit. 1 redirect
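The three scenarios above can be sketched as a tiny simulation. It assumes a site whose canonical host is the www subdomain and which follows the preload-list redirect order (HTTPS first, then www); redirectChain and redirectCount are names I’ve invented for this illustration.

```typescript
// Return every URL the browser requests, starting with the first one.
// browserKnowsHsts is true after an earlier visit or when the site is preloaded.
function redirectChain(typedUrl: string, browserKnowsHsts: boolean): string[] {
  // A bare host typed into the address bar defaults to plain HTTP.
  let url = typedUrl.indexOf("://") === -1 ? `http://${typedUrl}` : typedUrl;

  // With HSTS the browser upgrades internally, before any network request.
  if (browserKnowsHsts) {
    url = url.replace(/^http:/, "https:");
  }

  const chain: string[] = [url];

  // Redirect: HTTP -> HTTPS on the same host, so the naked domain can set the header.
  if (url.indexOf("http://") === 0) {
    url = url.replace(/^http:/, "https:");
    chain.push(url);
  }

  // Redirect: naked domain -> www subdomain.
  if (url.indexOf("https://www.") !== 0) {
    url = url.replace(/^https:\/\//, "https://www.");
    chain.push(url);
  }

  return chain;
}

// Number of redirects is one less than the number of URLs requested.
function redirectCount(typedUrl: string, browserKnowsHsts: boolean): number {
  return redirectChain(typedUrl, browserKnowsHsts).length - 1;
}
```

Calling redirectCount("example.com", false) gives 2 for the first visit, redirectCount("example.com", true) gives 1 for a repeat or preloaded visit, and an old http://www.example.com link with HSTS known gives 0.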

So, that makes sense to me. In order to set the HSTS upgrade header for all subdomains, it needs to hit the naked domain, not the www subdomain. This appears to be a new requirement to be added to the preload list, as the Github issue was raised on May 19th this year, and Lucas has said that this new rule will not be applied to websites that are already on the list (like Google, Twitter etc).

For me, this takes away much of the usefulness of HSTS, which is meant to save redirects to HTTPS by auto-upgrading connections. If I have to add another redirect in to get the header set on all subdomains, I’m not sure if it’s really worth it.

So, I asked another question:

And this is the response I got from Lucas

So it helps when people type in the URL, sending them to HTTPS first. This takes out the potential for any insecure traffic being sent. Thinking of the rest of the links on the internet, the vast majority of yell.com links will include the www subdomain, so HSTS and the preload list will take out that redirect, leaving zero redirects. That’s a really good win, that Lucas confirmed.

Summary – HSTS will likely change how you perform redirects

So, whilst this all feels very strange to me, and goes against the HTTPS Everywhere principles, it will eventually make things better in the long run. Getting subdomains for free is a great boost, though the preload list feels like a very exclusive club that you have to know about in order to make the best of HSTS. It’s also quite difficult to get off the list, should you ever decide that HTTPS is not for you, as you’ll have the HSTS header for 18 weeks, and there is no guarantee that the preload list will be modified regularly. It’s an experiment, but one that changes how you need to implement HSTS.

So, that’s my guide. Comments, queries, things I’ve gotten wrong, leave a comment below or on Twitter: @steveworkman

AMP – Long-term Harmful

You may have heard about a project that Google has just announced called AMP (Accelerated Mobile Pages), which aims to speed up the mobile web.

Publishers around the world use the mobile web to reach these readers, but the experience can often leave a lot to be desired. Every time a webpage takes too long to load, they lose a reader—and the opportunity to earn revenue through advertising or subscriptions. That’s because advertisers on these websites have a hard time getting consumers to pay attention to their ads when the pages load so slowly that people abandon them entirely.

I completely understand the need for this project – the web is slow and advertising is such a key revenue source for publishers that these adverts are ruining the experience. Facebook’s Instant project is designed to do much the same thing. AMP is not a new idea.

Technically, AMP builds on open standards and allows you to simply replace some parts of your build and some components of your system with AMP versions of those pages. What happens with these pages is the secret sauce – Google knows of these pages, and if they pass the AMP tests it caches versions of those sites on its own servers.

This isn’t too scary – Google has committed to allow the major advertising networks and pixel tracking so that businesses can maintain their precious statistics. AMP, in general, is a good idea.

Except for one thing.

If you’re not on it, you’re screwed.

Have a look at these two search pages. Both are searches for “Obama” on a Nexus 5 (emulated). The one on the right is the current search page, the one on the left is with AMP articles.

AMP search page comparison

Notice where the first organic listing is in each page. Currently, it’s just below the first page, but with AMP, it’s over two screens of content away! That’s absolutely massive, and for anyone not in the top 3 results, they may as well not be there.

Basically, if you’re not an AMP publisher, expect your traffic to drop, by double-digit percentages.

Short-term benefits, Long-term harm

Short-term, AMP will make publishers sit up and take notice of web performance, it’ll make websites faster across the board, and make more customers use Google. For users, this is a win/win situation.

I’ve come to the conclusion that this project is harmful to the web in the long term. If a publisher isn’t getting traffic because it’s all going to AMP publishers, then the amount of content on Google drops, quality drops, and what remains is concentrated in these large content producers. You end up with less diversity in the web, because small producers can’t make money without using AMP. Even in the end when you’ve only got a few publishers who can survive, they will be encouraged to reduce adverts, or be limited to the subset that Google’s DoubleClick deem performant enough to be included in the AMP-friendly set of advertisers. This may block the big-money adverts, which keep these sites going.

It also creates dependencies on Google. Whilst it’s unlikely to go down any time soon, your pages are cached on its servers, allow for only the most basic tracking mechanism, and work only in browsers that Google supports. It creates a two-tier system that the web is firmly against.

This project seems to polyfill a developer/organisation’s lack of time to dedicate to performance. Many very clever people are working on this education problem (hello to Path to Perf, Designing for Performance and PageSpeed) though it will take time. Short-term fixes like AMP and Facebook Instant are encouraging developers to take shortcuts to performance, handing their problems off to Google. This does nothing for the underlying issues, but with Google giving AMP such prominence in its search results, how can publishers resist?

Or is it a temporary solution?

I hope that from all of this, developers sit up and take notice of web performance; improve their sites and provide a better experience for all. If we don’t, Google will keep solutions like this around, leaving the only content that gets interaction as the content that Google approves of – and no one wants that.

Data-Driven Performance Breakout at Edge Conference

I was lucky enough to attend Edge Conf in London this year, a day that I always truly enjoy. The main sessions of the conference were streamed live and videos will be available later, but the break-outs weren’t recorded. These were the sessions I enjoyed the most and it’s a shame that people won’t see them without being there – so here are my notes on what was said to the best of my ability (and with a big hat tip to George Crawford for his notes). Patrick Kettner was the moderator.

Q: How can we use the masses of data that RUM collects to get businesses to care about performance?

Business leaders like metrics from companies that they can relate to (i.e. Amazon, eBay) but these aren’t very useful metrics as the scale is completely different. Finding stats from competing or relevant companies is hard, so how do you make them care?

Introducing artificial slowness is one way to convince people, but not good for business. There’s also the risk that you may not see an increase in conversion from speed improvements! Filmstrips are incredibly useful at this point to see what’s going on and these are available in Chrome DevTools in the super secret area.

Showing videos to business people makes it really hit home – people hate it when they can visibly see their site suck. It’s like making people watch a user test for their site. Shout out to Lara Hogan at Etsy (their engineering blog is awesome) for their great work on this, something that Yell has copied.

The metrics that are useful (first render, SpeedIndex) aren’t available in the browser. Using SpeedCurve can really make business people sit up and take notice of performance because it’s a pretty interface to those things.

All-in-all, the standard metrics are unlikely to be the best for you, so add in user timing markings (and a very simple polyfill) and graph those, including sending them to WebPageTest so you can measure the things that are important to you over time. This was done very successfully by The Guardian (hat tip Patrick Hamann).
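Here’s a minimal sketch of what adding your own user timing markings could look like. It only relies on the mark, measure and getEntriesByName calls from the User Timing API; the UserTiming interface and the measureStep helper are mine, written so that the timing object is passed in rather than assumed to be global.

```typescript
// The subset of the User Timing API this sketch needs.
// In a browser, the global `performance` object has this shape.
interface UserTiming {
  mark(name: string): unknown;
  measure(name: string, startMark: string, endMark: string): unknown;
  getEntriesByName(name: string): { duration: number }[];
}

// Wrap a piece of work in start/end marks and return its duration in milliseconds.
function measureStep(perf: UserTiming, name: string, work: () => void): number {
  perf.mark(`${name}:start`);
  work();
  perf.mark(`${name}:end`);
  perf.measure(name, `${name}:start`, `${name}:end`);

  // Use the most recent measure in case the same name was recorded before.
  const entries = perf.getEntriesByName(name);
  const latest = entries[entries.length - 1];
  return latest ? latest.duration : NaN;
}

// Hypothetical browser usage, where renderHero is your own function:
//   const ms = measureStep(performance, "hero-render", renderHero);
```

Pick names that mean something to the business (“hero image visible”, “search box usable”) and graph those rather than the generic load events.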

Q from Ilya Grigorik: The browser loading bar is a lie, yet users wait for it. What metric should it use?

Basically, developers can put their loading after the onload event to hack around the loading spinner. If we stop the spinner at first render, the page isn’t usable yet. If we stop it when the page can be interacted with, when would that be? The browser runs the risk of “feeling slower” or “feeling faster” by just changing the progress bar. Apparently there’s one browser that just shows the bar for three seconds, nothing more.

No real consensus was reached here, but it was a very interesting discussion.

Q: Flaky or dropped connections are important to know about for performance metrics – what can the room say about their experiences gathering offline metrics?

When the FT tried this with their web app they often exceeded localStorage sizes and sometimes POST sizes (25MB) as users could be offline for a week or more. The Guardian had good success with bundling beacons up into one big post to save money with Adobe Omniture/SiteCatalyst.

The best solution is the Beacon API (sendBeacon), which promises to deliver the payload at some point (which images/XHR don’t right now). It’s implemented in Google Analytics and you just have to enable it in the config; other tracking providers don’t have it right now.
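Putting the bundling idea and the Beacon API together, a sketch might look like the following. BeaconQueue is an illustrative class of my own, not what the FT or The Guardian actually run; the send function is passed in so that in a browser you can hand it navigator.sendBeacon, and the entry cap is an arbitrary number to stop a week offline from blowing your storage limits.

```typescript
// Same shape as navigator.sendBeacon(url, data): returns true if the
// browser accepted the payload for delivery.
type SendFn = (url: string, body: string) => boolean;

class BeaconQueue {
  private pending: object[] = [];
  private url: string;
  private send: SendFn;
  private maxEntries: number;

  constructor(url: string, send: SendFn, maxEntries: number = 500) {
    this.url = url;
    this.send = send;
    this.maxEntries = maxEntries;
  }

  // Queue one metric, dropping the oldest entries once the cap is reached.
  add(event: object): void {
    this.pending.push(event);
    if (this.pending.length > this.maxEntries) {
      this.pending.splice(0, this.pending.length - this.maxEntries);
    }
  }

  // Send everything as one request; keep the entries if the send was refused.
  flush(): boolean {
    if (this.pending.length === 0) {
      return true;
    }
    const accepted = this.send(this.url, JSON.stringify(this.pending));
    if (accepted) {
      this.pending = [];
    }
    return accepted;
  }

  get size(): number {
    return this.pending.length;
  }
}

// Hypothetical browser usage ("/rum" is a made-up endpoint):
//   const queue = new BeaconQueue("/rum", (u, b) => navigator.sendBeacon(u, b));
```

One bundled request per flush is what saves the money with per-hit analytics pricing, and keeping refused payloads means a flaky connection doesn’t lose the data.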

Q: What metrics APIs are missing in browsers?

A unique opportunity to ask Ilya to add APIs into Chrome – not to be passed up

  • Frame Timing API – requested as an ES7 observable (which is unlikely).
  • Performance Observer – a subscribable stream of events that will need processing to be useful. This will give accurate frame-rate
  • Network error logging API – could work like an error reporter that posts to a configurable second origin (via a header like CSP)
  • JavaScript runtime errors without hacking window.onerror
  • SpeedIndex, or a proxy for it. There’s a script for this already but it’s not massively accurate. Standardising SpeedIndex would be great.
  • First Paint – according to Ilya it’s not possible and quite subjective browser-to-browser
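For anyone wondering what SpeedIndex in the list above actually measures: it’s the area above the visual-progress curve, so the longer a page stays visually incomplete, the higher the score. Here’s a sketch that computes it from filmstrip-style samples; the Frame shape and the speedIndex function are my own illustration, and it assumes you already have a visual completeness figure for each frame, which is exactly the part the browser can’t give you today.

```typescript
// One filmstrip sample (illustrative shape).
interface Frame {
  timeMs: number;           // time since navigation start
  visuallyComplete: number; // fraction of the final render visible, 0 to 1
}

// Sum, for each interval between frames, the interval length multiplied by
// how incomplete the page was during it. Frames must be in time order.
function speedIndex(frames: Frame[]): number {
  let total = 0;
  let lastTime = 0;
  let lastComplete = 0; // nothing is painted before the first frame

  frames.forEach((frame) => {
    total += (frame.timeMs - lastTime) * (1 - lastComplete);
    lastTime = frame.timeMs;
    lastComplete = frame.visuallyComplete;
  });

  return total;
}
```

A page that is half painted at 1 second and complete at 2 seconds scores 1000 + 500 = 1500, whereas one that shows nothing until 2 seconds scores 2000, even though both “finish” at the same time.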

Wrap-up

I’d have loved to stay and chat more (nice to meet Tim Kadlec in person, shout out to the Path to Performance podcast as well). It’s rare to have a lot of the web performance community in the same room at the same time, and it should definitely happen more often.

If there are things I’ve missed, let me know in the comments or on Twitter (@steveworkman)