Posts Tagged ‘HTML5’

HTML5 in your blog

Tuesday, February 16th, 2010

This is the second part of a series on the construction of my blog. In this post, I’ll be focusing on the underlying HTML5 markup.

When I started making this blog, I had to decide which markup language to build it on. The choices were simple: HTML4, XHTML 1.0, or HTML5. The first two options represent the old school of HTML, both based on standards from 1999, with tags which have worked for years and are well understood and supported across all browsers. Where these standards fall down is semantics: they have very few tags that describe what a block of content actually is.

Since table layouts fell out of favour, site content has been divided into large blocks created with <div> tags. Having a single multi-purpose tag has been very useful, but it is now littered across websites as authors try to give their content semantic structure using an element that carries no meaning of its own.

HTML5 Tags

HTML5 brings a new set of structural tags for designers and developers to use in their sites. These tags are:

  • <header>
  • <nav>
  • <section>
  • <article>
  • <aside>
  • <footer>
  • <figure>

These semantic elements can all be used in place of <div> tags, as they are all “block-level” elements. For an example of how they can be used, let’s take a look at the front page of my blog.

Steve Workman blog layout

The blog clearly has a firm structure, one that could quite easily have been built from <div> tags. Look closer and you’ll see that there isn’t a single <div> tag defining general structure. Only the <article> and <section> tags have been repeated at any point in the page. Take a look at this next screenshot, without the content visible:

Steve Workman's blog structure

The layout is very simple, making full use of the HTML5 tags. The advantages here are clear:

  • Screen reader users can skip straight past the <header> and <nav> and jump from one <article> to the next, either with keyboard shortcuts or with the screen reader’s own navigation
  • For search engines, every <article> is marked up as such, making it easier to determine what the actual content of the site is
  • The same goes for the <nav>, which clearly identifies the links to the main parts of the site.
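To make that structure concrete, here is a minimal sketch of a front page built from the new tags. It is not the exact markup of this blog: the page title, link targets and post titles are placeholders made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example blog</title>
</head>
<body>
  <!-- Site-wide banner -->
  <header>
    <h1>Example blog</h1>
  </header>

  <!-- Main site navigation -->
  <nav>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/about/">About</a></li>
    </ul>
  </nav>

  <!-- Content area: one <article> per post -->
  <section>
    <article>
      <h2>First post title</h2>
      <p>Excerpt of the first post.</p>
    </article>
    <article>
      <h2>Second post title</h2>
      <p>Excerpt of the second post.</p>
    </article>
  </section>

  <!-- Secondary area for links and widgets -->
  <section>
    <h2>Elsewhere</h2>
    <p>Links to other sites go here.</p>
  </section>

  <footer>
    <p>Copyright notice goes here.</p>
  </footer>
</body>
</html>
```

Only <section> and <article> appear more than once, and there is no <div> holding the layout together.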

There are other benefits too. This blog uses WAI-ARIA to identify the role of each of the sections. Using this, the content section is marked up differently to the toolbox section where my Twitter and Last.fm links are displayed.
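As a rough sketch of how WAI-ARIA roles can tell two <section> elements apart, the snippet below assumes the landmark roles main (for the content) and complementary (for the toolbox). Those two roles are one reasonable mapping, not a copy of this blog's markup, and the link targets are placeholders.

```html
<!-- role="main" marks the primary content of the page -->
<section role="main">
  <article>
    <h2>Post title</h2>
    <p>Post content.</p>
  </article>
</section>

<!-- role="complementary" marks supporting content, such as a toolbox -->
<section role="complementary">
  <h2>Toolbox</h2>
  <ul>
    <li><a href="#">Twitter</a></li>
    <li><a href="#">Last.fm</a></li>
  </ul>
</section>
```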

It’s a similar story on a single article page. Making use of the new tags makes reading an article easier and more clearly defined:

Steve Workman's blog single page layout

The single page template makes use of the <aside> tag to mark up the sidebar as secondary information that is only loosely related to the article itself.
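A single article page along those lines might look something like this. It is a sketch only, with a placeholder heading and sidebar content.

```html
<!-- The post itself is the main content of the page -->
<article>
  <h1>Post title</h1>
  <p>The full text of the post.</p>
</article>

<!-- The sidebar is related to the page but not part of the post -->
<aside>
  <h2>Related links</h2>
  <ul>
    <li><a href="#">Another post</a></li>
  </ul>
</aside>
```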

Browser issues

As usual, IE has problems rendering these new tags. Thankfully, Remy Sharp has a very useful bit of JavaScript that makes everything work nicely. IE then acts just like the other browsers. Problem solved.
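For the curious, the trick behind scripts of this kind can be sketched in a few lines. This is the general technique as I understand it, not a copy of Remy's script: older versions of IE will not style an element they do not recognise, but calling document.createElement once for each tag name is enough to make them recognise it. The conditional comment and the list of tags below are assumptions chosen for this example.

```html
<head>
  <!-- Must run in the <head>, before the browser parses the new tags -->
  <!--[if lt IE 9]>
  <script>
    // Creating each element once makes old IE treat the tag as known,
    // so that CSS rules are applied to it
    var tags = ['header', 'nav', 'section', 'article', 'aside', 'footer', 'figure'];
    for (var i = 0; i < tags.length; i++) {
      document.createElement(tags[i]);
    }
  </script>
  <![endif]-->
  <style>
    /* Browsers with no built-in styles for these tags render them inline */
    header, nav, section, article, aside, footer, figure {
      display: block;
    }
  </style>
</head>
```

Other browsers see the conditional comment as an ordinary comment and skip the script entirely.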

HTML5 Video

Structural tags are one of the few parts of HTML5 currently implemented in most modern web browsers. Other technologies such as web databases and drag & drop are still a long way off. What isn’t a long way off is HTML5’s <video> tag: native video in the browser. YouTube and Vimeo are already implementing this, so you can too. It’s simple enough: just encode your videos in H.264 (for WebKit browsers) and Ogg (for Firefox and Opera) and you’re good to go. If you want to know a bit more about the H.264/Ogg debate, read Bruce Lawson’s column.
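Serving both encodings means giving the <video> element two <source> children and letting the browser pick the first one it can play. Here is a minimal sketch, assuming two files called video.mp4 and video.ogv (both names are placeholders):

```html
<video controls width="640" height="360">
  <!-- H.264 in an MP4 container, for WebKit browsers -->
  <source src="video.mp4" type="video/mp4">
  <!-- Ogg, for Firefox and Opera -->
  <source src="video.ogv" type="video/ogg">
  <!-- Shown only by browsers that do not support <video> at all -->
  <p>Your browser does not support HTML5 video.
     <a href="video.mp4">Download the video</a> instead.</p>
</video>
```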

There are lots more benefits to HTML5 that I’ll show as I add them to the blog (like the <mark> element!).

I hope you’ve found this useful. The accessibility benefits of HTML5 are excellent, and I hope you’ll be upgrading your blog soon.

HTML 5 Forms – a spammer’s paradise

Monday, January 18th, 2010

HTML 5 form spam
Did you know that HTML 5, the spec that will be completed in 2022 but has some bits available now, will have a whole new set of form elements designed to make complex forms available natively in the browser? I’ve been to a few talks where Opera’s Bruce Lawson has demoed and talked about these upcoming features, which have already been implemented in the Opera browser. From an accessibility standpoint it looks great; no longer will screen readers have to rely on labels to infer the type of data to be entered into forms. From a developer’s standpoint, you won’t have to code JavaScript date pickers any more, nor have to rely on JavaScript for validation.
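To give a flavour of what this looks like, here is a small sketch of a form using some of the new input types. The action URL and field names are made up for the example, and how much of it a given browser supports will vary.

```html
<form action="/comment" method="post">
  <!-- The browser checks that this looks like an e-mail address -->
  <label for="email">E-mail</label>
  <input type="email" id="email" name="email" required>

  <!-- The browser checks that this looks like a URL -->
  <label for="website">Website</label>
  <input type="url" id="website" name="website">

  <!-- Supporting browsers show a native date picker -->
  <label for="visited">Date of visit</label>
  <input type="date" id="visited" name="visited">

  <button type="submit">Send</button>
</form>
```

In a browser that supports these types, no JavaScript is needed for either the date picker or the basic validation.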

So, all of this makes it easier to enter data on the web, which is a great thing. This morning I asked myself, “who enters the most data on the internet?” The answer is spammers. It is generally thought that 90% of all e-mail sent is spam, and a quick glance at my blog’s spam counter shows 7,300 fake comments caught compared to 56 real comments.

So, why will HTML 5 forms be such a problem? Well, at the moment, spammers use automated tools to crawl the internet, looking for forms to fill in to spread their advertising links or perform XSS attacks. To bypass most validation, the crawlers look for labelled form fields to fill in. Quite simply, HTML 5 forms will make this job easier.

Instead of labelling a field with “e-mail”, there’s now a specific input type, <input type="email">, which validates an e-mail address. The common anti-spam method of adding a second e-mail field, hidden from normal users with CSS, will stop working: bots can simply ignore the decoy, because the real e-mail field is clearly identified by its type.

Form validation may be useful for the normal user, but it’s even more useful for the spammer. With the limits on each field now written in plain text as attributes on the input, it becomes trivial for bots to enter correct data.
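Here is a sketch of what a bot would see when it reads such a form. The field names, limits and the decoy class are all invented for the example. The decoy is the old hidden-field trick, and the typed field next to it makes it obvious which one is real.

```html
<style>
  /* Decoy field: hidden from people, but still present in the markup */
  .decoy { display: none; }
</style>

<form action="/comment" method="post">
  <!-- Real field: its type announces exactly what belongs here -->
  <label for="email">E-mail</label>
  <input type="email" id="email" name="email" required>

  <!-- Decoy: real users never see it, so any value suggests a bot -->
  <input type="text" name="email2" class="decoy">

  <!-- The accepted range can be read straight from the attributes -->
  <label for="rating">Rating</label>
  <input type="number" id="rating" name="rating" min="1" max="5" required>

  <!-- So can the maximum length and the allowed characters -->
  <label for="postcode">Postcode</label>
  <input type="text" id="postcode" name="postcode" maxlength="8" pattern="[A-Za-z0-9 ]+">

  <button type="submit">Send</button>
</form>
```

A bot that understands type, min, max, maxlength and pattern has everything it needs to produce a submission that passes validation first time.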

So, what can be done about this? Well, I’m not sure. There are some anti-spam methods that will still work, for instance recording when the page was loaded and seeing how long it took to complete the form. Very short times are treated as spam, short times are sent for moderation and normal times are approved. There’s CAPTCHA, which is inaccessible, and then there’s blacklisting, which hasn’t worked for years.
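The timing idea can be sketched as follows. The two thresholds, the field names and the verdict labels are all hypothetical. The check is done in the page here purely to show the logic; a real implementation would need to record the load time and make the decision on the server, since a bot can ignore or fake anything that happens in the browser.

```html
<form id="comment-form" action="/comment" method="post">
  <label for="comment">Comment</label>
  <textarea id="comment" name="comment" required></textarea>
  <!-- Filled in on submit with the timing verdict -->
  <input type="hidden" id="verdict" name="verdict">
  <button type="submit">Post comment</button>
</form>

<script>
  // Time at which the form was shown to the visitor
  var loadedAt = new Date().getTime();

  // Hypothetical thresholds, in milliseconds
  var SPAM_BELOW_MS = 3000;
  var MODERATE_BELOW_MS = 15000;

  // Very short times are spam, short times go to moderation,
  // and anything longer is approved
  function classify(elapsedMs) {
    if (elapsedMs < SPAM_BELOW_MS) { return 'spam'; }
    if (elapsedMs < MODERATE_BELOW_MS) { return 'moderate'; }
    return 'approve';
  }

  document.getElementById('comment-form').onsubmit = function () {
    var elapsedMs = new Date().getTime() - loadedAt;
    document.getElementById('verdict').value = classify(elapsedMs);
  };
</script>
```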

If you have any theories, please share them here. If there’s a solution or something the working group can do to make spam more difficult rather than easier, it should get into the spec sooner, rather than later.