17 Oct 2011

HTML5 and SEO: New Strategies for Optimizing Code


The evolution of HTML has been a long transition away from how things look and toward what things mean. Gone are the days when web pages were cobbled together in HTML tables to compensate for limited layout controls. (Or, rather, those days should be gone. You know who you are.) Improvements in the language and the rise of CSS allowed web programmers to build pages that looked great for users and made better sense to search engines.

But even with the progress in semantic markup (coding to describe what content conveys, not what it looks like), there has still been the necessary evil of <div> and <span> tags. These semantically meaningless elements continued to be used as a fallback for structure and design when semantic tags fell short. While this was closer to the ideal than tables, it was still a programming crutch to fill a shortcoming in the HTML language, and another bit of meaningless code for search engines to work through to get to the stuff with purpose.

About HTML5

With the advent of HTML5, many of those shortcomings have been addressed. Ubiquitous workarounds like <div id="header"> and <div id="footer"> have now been given their own formal tags, <header> and <footer>. This allows web content to be organized in a way that’s sensible, and may influence how search engines view, understand and rank content.

Here’s a common example of how a page relying on <div> tags would be structured:

[Image: Website Code Unoptimized for HTML5]
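
As a rough sketch of the kind of <div>-based structure described here (the id and class names, headings and links are illustrative assumptions, not a transcription of the original image), such a page might look like this:

```html
<!-- Illustrative sketch only: all id/class names and content are assumed -->
<body>
  <div id="header">
    <h1>Site Name</h1>
    <div id="nav">
      <a href="/">Home</a>
      <a href="/blog/">Blog</a>
    </div>
  </div>
  <div id="content">
    <div class="section">
      <h2>Post Title</h2>
      <p>Post content...</p>
    </div>
  </div>
  <div id="sidebar">
    <p>Related links...</p>
  </div>
  <div id="footer">
    <p>Copyright information</p>
  </div>
</body>
```

Every container here is a plain <div>, so any meaning lives only in names that a programmer happened to choose.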

It gets the job done, but identifiers like “section” or “sidebar” don’t mean anything to search engines. And as naming conventions vary from programmer to programmer, a content block serving as a sidebar might be called “sidebar” or “side” or “secondary” or something else entirely. But now let’s look at that same page structure using the new HTML5 tags:

[Image: Website Code Optimized for HTML5]
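
A comparable sketch using the HTML5 structural tags might look like the following. Again, the content and URLs are placeholders I have assumed for illustration, not a copy of the original image:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example Page</title>
</head>
<body>
  <!-- Site-wide branding and main navigation -->
  <header>
    <h1>Site Name</h1>
    <nav>
      <a href="/">Home</a>
      <a href="/blog/">Blog</a>
    </nav>
  </header>
  <!-- Primary, self-contained content -->
  <article>
    <section>
      <h2>Post Title</h2>
      <p>Post content...</p>
    </section>
  </article>
  <!-- Secondary content related to the main content -->
  <aside>
    <p>Related links...</p>
  </aside>
  <footer>
    <p>Copyright information</p>
  </footer>
</body>
</html>
```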

The content is organized the same, but each element now has a meaning that’s explicitly clear. No longer forced to guess between variously named <div> tags, a search engine can more easily and consistently gauge the content on your page for importance and relevance.

New Structural Tags

A number of tags are being added and deprecated in this (still developing) standard. I’ve previously discussed how HTML5 microdata has been adopted by schema.org for local optimization and other rich snippets, but here are some of the more relevant structural developments I used above:


The <header> Tag

Almost every site has some flavor of <div id="header"> that contains the site logo, site-wide branding and introductory content. The new <header> tag replaces this workaround. More importantly, it can also be used multiple times on a page to serve as an introductory area for each <section> tag. These types of <header> tags typically contain headings (<h1>, <h2>, etc.), and the purpose is similar: to briefly summarize and introduce subsequent content.

However, <header> tags can also contain expanded content like paragraph tags and optimized links. While these likely won’t get as much SEO weight as a heading tag, they afford more options beyond what you can cram into an <h2>.
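
As a minimal sketch of a <header> used inside a <section> (the heading text and link URL below are hypothetical placeholders):

```html
<section>
  <!-- Introductory area for this section: a heading plus a short summary -->
  <header>
    <h2>Choosing a Doctype</h2>
    <p>A brief summary of this section, with a
      <a href="/doctype-guide/">descriptive link</a>.</p>
  </header>
  <p>The main content of the section follows the header...</p>
</section>
```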


The <nav> Tag

Typically near or nested within the page’s main <header> tag, the <nav> tag serves to identify a collection of links on a page. This can be of particular value for the main site navigation to make website structure more apparent, and identify other key sets of links.

Other areas to use the <nav> tag are secondary navigation links (e.g. related articles), pagination links, and breadcrumb navigation links. By identifying these main navigation areas, bots crawling your site will have a clearer path through your content.
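
A minimal sketch of both uses, assuming placeholder URLs and link text: a main site navigation nested in the page <header>, followed by a separate breadcrumb <nav>.

```html
<header>
  <!-- Main site navigation -->
  <nav>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/services/">Services</a></li>
      <li><a href="/blog/">Blog</a></li>
    </ul>
  </nav>
</header>

<!-- Breadcrumb navigation for the current page -->
<nav>
  <a href="/">Home</a> &gt;
  <a href="/blog/">Blog</a> &gt;
  HTML5 and SEO
</nav>
```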


The <article> Tag

The <article> tag is intended to encapsulate a self-contained piece of content. This could be a blog post, informational page or discussion forum entry. The idea is that anything in an <article> tag can stand on its own as a piece of content (for example, if you saw it alone in an RSS feed). Text and images within an <article> are more clearly identified as primary content, and will likely be weighted as more significant to a page than something tucked in an <aside> or <footer>.


The <section> Tag

The <section> tag is used to define related stretches of content. For example, an <article> may be broken up into different <section> areas, each usually starting with a <header>, <h2> or similar lead-in. Unlike an <article>, a <section> is not a standalone piece of content.

One potential benefit to this organization is that it may allow search engines to consider each <section> more independently, rather than as one long piece of content. If one <section> mentions a given phrase multiple times, it is logical that a search engine might look at that content more authoritatively than if the phrase were used more sporadically across multiple <section> areas.
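
To illustrate how the two tags nest, here is a minimal sketch of an <article> divided into <section> areas. The titles and text are placeholders I have assumed for the example:

```html
<article>
  <header>
    <h1>A Guide to Local Search</h1>
  </header>
  <!-- Each section groups related content under its own lead-in heading -->
  <section>
    <h2>Claiming Your Listing</h2>
    <p>Content about claiming a listing...</p>
  </section>
  <section>
    <h2>Collecting Reviews</h2>
    <p>Content about collecting reviews...</p>
  </section>
</article>
```

The <article> as a whole could stand alone in a feed, while each <section> only makes sense as part of it.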


The <aside> Tag

While the temptation is to assume that the <aside> tag equates to any sidebar-type content, the intention is more specific. This element identifies content that is set aside from the primary content, but which relates to it. Examples would be pull-quotes, footnotes, source links or other content that is ancillary, but which relates directly to the main content. By being able to clearly identify important secondary content, even if it’s visually placed over in a sidebar, you can distinguish it from less-important content like blogroll links and general branding info.
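
As a small sketch of that narrower use (placeholder text, and one possible arrangement rather than the only valid one), an <aside> holding a pull-quote can sit inside the <article> it relates to:

```html
<article>
  <h1>Post Title</h1>
  <p>Main content of the post...</p>
  <!-- Ancillary content that relates directly to this article -->
  <aside>
    <p>A short pull-quote taken from the post.</p>
  </aside>
  <p>More main content...</p>
</article>
```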


The <footer> Tag

While not as useful as the <header>, the <footer> tag can be used to identify closing or secondary information at the end of a <section> or page. Like the <header> tag, there can be multiple <footer> tags on a page. In addition to common footer links, a <footer> may contain authorship material for a blog post, copyright information or related links.
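
A minimal sketch of two <footer> tags on one page, one closing an <article> and one closing the page. The author name, company name and URL are hypothetical:

```html
<article>
  <h1>Post Title</h1>
  <p>Post content...</p>
  <!-- Footer for this article: authorship information -->
  <footer>
    <p>Written by <a href="/authors/jane-doe/">Jane Doe</a></p>
  </footer>
</article>

<!-- Footer for the page as a whole -->
<footer>
  <p>&copy; 2011 Example Company</p>
</footer>
```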

Should I Use HTML5 Now?

HTML5 is solidifying as a standard, and is supported by all the major web browsers with some variation. It was only a year ago that Google was espousing an “it won’t help, but it can’t hurt” attitude toward HTML5. But in that year, the push toward HTML5 has continued unabated. As with the development of the schema.org standard, it’s clear that search engines like having a semantic standard that helps them improve how they understand and deliver content.

With that standard already in place, the sooner that sites adapt their content to HTML5, the better chance they have to stand out from the sea of indistinct <div> tags — and to be ahead of the curve as search algorithms increasingly weight clean, semantic code.

What are your thoughts on moving toward the HTML5 standard, and potential SEO benefits? Let us know in the comments!


  • FD Thomson Oct 18, 2011

    Well David Gould! I came to know very useful information about HTML5 by reading this post. Thanks for sharing this post.

  • Brent Nau Oct 19, 2011

    Thanks David. This was a nice introduction to HTML5. I am just starting to convert some of my own websites into HTML5.

  • happysus Sait Oct 19, 2011

    I don’t understand how HTML5 could improve my site’s SEO. I think it increases site speed.

  • David Gould Oct 19, 2011

    Thanks FD! And thanks Brent, I think it’ll be a good step to be ahead of the curve as ranking factors evolve.

  • David Gould Oct 19, 2011


    It probably will have minimal impact on site speed, at least regarding the tags I discussed. Whether it’s an article or a div tag, it should take a comparable amount of time to parse and render the same content and styling.

    However, the SEO advantages are primarily that you’re explicitly indicating to search engines which of your content is more or less relevant. While it’s early in the process to see how Google and others will apply this to their algorithms, such a clear indication of content priority should become a notable factor, especially given the support the search engines have put into HTML5.


  • josh bachynski - martial arts seo guy Oct 19, 2011

    I have read that for the rel=canonical or rel=author LINK tags to work that they need to be implemented in an HTML5 page, that is to say one with an HTML 5 DOCTYPE – can anyone confirm this?

    If so, there is a reason to go to HTML 5 right now.

  • Dominique Peladeau Oct 20, 2011

    means something in English, but not in French. I prefer the div id tag system because the id can also serve for seo purpose (who knows what G does with the values contained in the id tags…) Also, since I am french, and since I use french in the id tags, I avoid Javascript or html reserved words.

  • David Gould Oct 20, 2011

    Hi Josh,

    An HTML5 tag will work in a browser that supports it, whether or not the Doctype is HTML5. Unless the Doctype is an old quirks mode (which it really shouldn’t be), HTML5 and most other HTML4 Doctypes just trigger a browser’s standards mode and it will render the same either way. So an article tag will work in all modern browsers, whether or not you’re using the HTML5 Doctype. Likewise, a search engine will be able to use the rel=author attribute, whether or not it explicitly jells with the Doctype.

    Now, that is different from the code validating. rel=author is an HTML5 standard, so it won’t fully validate if your Doctype is HTML4, but it should still render and be utilized by search engines properly.

    How important validation is to you is your call. All things being equal, obviously validated code is preferred. But given that rel=canonical is not a standard (at least yet), and won’t validate even with an HTML5 Doctype, it’s a case of usefulness outweighing superficially perfect validation.

    Hope that helps!

  • David Gould Oct 20, 2011

    Salut Dominique,

    It will still matter in French and other non-English languages. The benefit to the new HTML5 structural tags is not the tag words themselves (there is no keyword benefit to either header or div=header), but rather what those tags tell a search engine. A div tag doesn’t give search engines any clue as to how important its content is. These new HTML5 tags do.

    The idea is that whatever you nest in an article tag, for example, is likely more relevant to that page than something in the footer tag. That will have a likely SEO benefit, whatever language the content itself is in.

    And you can still use id attributes for the new tags as well, like you do currently with divs.

    Merci pour vos pensées!

  • Dominique Peladeau Oct 20, 2011

    Hello David,

    I still prefer the div tag. Very often I use CSS in such a way that I present to G or B the main content of my web page first, not the header, not the left or right side, but the article itself. I am talking source code here. I do it that way because the CSS styles allow me to present the web page code in a certain order to the robots while the web page renders in another order, i.e. users can see the web page like it should be: header first, left nav or sidebar, etc. This way, G or B or whatever robot is indexing the web page gets to read the fresh content (article) first, not the header or the navigation (which can be very lengthy in some cases). I do not know if this would still be the case with the new HTML5 tags.
    Merci pour votre blogue!

  • David Gould Oct 20, 2011

    You can style and lay out the HTML5 tags the exact same way you do currently with divs, you just get the added bonus of being able to convey additional semantic meaning to the search engines which can improve your SEO.

  • Dominique Peladeau Oct 20, 2011

    Hello David,

    Thank you for the clarifications. It’s always good when we can improve our SEO.
    Do you really think we should code for the search engines? I am thinking schema.org stuff here, which is similar to the semantic meaning the header and article tags should bring. I doubt that the added effort will give good ROI. In 2 or 3 years from now the search engines should start, I hope, to understand the natural relations between words, and all the schema.org stuff will be irrelevant. Also, I think black hat stuff will render the article, header, and all the other semantic tags useless, just like the meta tags got abused. I guess content is and will still be king.

    Have a nice day!

  • David Gould Oct 20, 2011

    It’s really minimal effort and a very solid ROI. It takes a couple minutes to code an address to schema.org standards, and you’ve immediately taken all the guesswork out of search engines recognizing you for local search. There will always be someone trying to game the system, but the ability for white hat SEO to give precise meaning to search engines is huge.

    Content is king, but good semantic markup is the crown on its head that lets everyone know.

  • Noy Oct 22, 2011

    Excellent post I need other options other than just relying on google. It’s nice to finally read some proper seo tactics. There’s too much contradicting info on the web. Very easy to follow well written article. Thanks very much.

  • Gyi Tsakalakis Nov 02, 2011

    This was one of the more clear examples of how HTML5 may have benefits for communicating with search engines in the future. Thanks. Do you have any recommendations for HTML5 references? I’ve been poking around online, but am looking for something comprehensive.

  • David Gould Nov 02, 2011

    Thanks Gyi.

    There are many articles out there addressing the subject in different ways. A good place to start is the W3Schools HTML5 reference — http://www.w3schools.com/html5/html5_reference.asp — which covers what’s new, deprecated, unsupported, as well as some nice examples of each tag.