HTML5 and SEO: New Strategies for Optimizing Code

October 17th, 2011 • By: David Gould • Search Engine Optimization

The evolution of HTML has been a long transition away from how things look and toward what things mean. Gone are the days when web pages were cobbled together in HTML tables to compensate for limited layout controls. (Or, rather, those days should be gone. You know who you are.) Improvements in the language and the rise of CSS allowed web programmers to build pages that looked great for users and made better sense to search engines.

But even with the progress in semantic markup – coding to describe what content conveys, not what it looks like – there has still been the necessary evil of <div> and <span> tags. These semantically meaningless elements continued to be used as a fallback for structure and design when semantic tags fell short. While this was closer to the ideal than tables, it was still a programming crutch to fill a shortcoming in the HTML language, and another bit of meaningless code for search engines to work through to get to the stuff with purpose.

About HTML5

With the advent of HTML5, many of those shortcomings have been addressed. Ubiquitous workarounds like <div id="header"> and <div id="footer"> have now been given their own formal tags, <header> and <footer>. This allows web content to be organized in a way that’s sensible, and may influence how search engines view, understand and rank content.

Here’s a common example of how a page relying on <div> tags would be structured:

[Image: Website Code Unoptimized for HTML5]
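A div-based page of this kind might look roughly like the following. This is only a sketch: the id and class names, headings and link targets are placeholders chosen for illustration, not markup from any particular site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page (div-based layout)</title>
</head>
<body>
  <!-- Every region is a generic div; only the id or class hints at its role -->
  <div id="header">
    <h1>Site Name</h1>
    <div id="nav">
      <a href="/">Home</a>
      <a href="/blog/">Blog</a>
    </div>
  </div>

  <div id="content">
    <div class="section">
      <h2>Post Title</h2>
      <p>Main content goes here.</p>
    </div>
  </div>

  <div id="sidebar">
    <p>Related links and other secondary content.</p>
  </div>

  <div id="footer">
    <p>Copyright information.</p>
  </div>
</body>
</html>
```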

It gets the job done, but identifiers like “section” or “sidebar” don’t mean anything to search engines. And as naming conventions vary from programmer to programmer, a content block serving as a sidebar might be called “sidebar” or “side” or “secondary” or something else entirely. But now let’s look at that same page structure using the new HTML5 tags:

[Image: Website Code Optimized for HTML5]
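As a rough sketch, the same kind of page rebuilt with the HTML5 structural tags could look like this. The content and link targets are placeholders, and the exact nesting is one reasonable arrangement rather than the only valid one.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example page (HTML5 structural tags)</title>
</head>
<body>
  <!-- Each region now states its role through the tag itself -->
  <header>
    <h1>Site Name</h1>
    <nav>
      <a href="/">Home</a>
      <a href="/blog/">Blog</a>
    </nav>
  </header>

  <article>
    <section>
      <h2>Post Title</h2>
      <p>Main content goes here.</p>
    </section>
  </article>

  <aside>
    <p>Related links and other secondary content.</p>
  </aside>

  <footer>
    <p>Copyright information.</p>
  </footer>
</body>
</html>
```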

The content is organized the same, but each element now has a meaning that’s explicitly clear. No longer forced to guess between variously named <div> tags, a search engine can more easily and consistently gauge the content on your page for importance and relevance.

New Structural Tags

A number of tags are being added and deprecated in this (still developing) standard. I’ve previously discussed how HTML5 microdata has been adopted by schema.org for local optimization and other rich snippets, but here are some of the more relevant structural developments I used above:

<header>

Almost every site has some flavor of <div id="header"> that contains the site logo, site-wide branding and introductory content. The new <header> tag replaces this workaround. More importantly, it can also be used additional times on a page to serve as an introductory area for each <section> tag. These types of <header> tags typically contain headings (<h1>, <h2>, etc.), and the purpose is similar – to briefly summarize and introduce subsequent content.

However, <header> tags can also contain expanded content like paragraph tags and optimized links; while these likely won’t get as much SEO weight as a heading tag, they afford more options beyond what you can cram into an <h2>.
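For illustration, a <header> used to introduce a <section> might be written along these lines. The heading text, summary and link are invented placeholders.

```html
<!-- Fragment: place inside <body>. All text and URLs are placeholders. -->
<section>
  <header>
    <h2>Choosing a Doctype</h2>
    <!-- A header can carry more than the heading itself -->
    <p>A short summary of what this section covers and why it matters.</p>
    <p><a href="/guides/doctype/">Read the full Doctype guide</a></p>
  </header>
  <p>The body of the section follows its header.</p>
</section>
```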

<nav>

Typically near or nested within the page’s main <header> tag, the <nav> tag serves to identify a collection of links on a page. This can be of particular value for the main site navigation to make website structure more apparent, and identify other key sets of links.

Other areas to use the <nav> tag are secondary navigation links (e.g. related articles), pagination links, and breadcrumb navigation links. By identifying these main navigation areas, bots crawling your site will have a clearer path through your content.
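A sketch of two of these uses, main site navigation nested in the page <header> and a separate breadcrumb trail, might look like this. The page names and URLs are assumptions made up for the example.

```html
<!-- Fragment: place inside <body>. Link text and URLs are placeholders. -->
<header>
  <h1>Site Name</h1>
  <!-- Main site navigation -->
  <nav>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/services/">Services</a></li>
      <li><a href="/blog/">Blog</a></li>
    </ul>
  </nav>
</header>

<!-- Breadcrumb navigation for the current page -->
<nav>
  <a href="/">Home</a> &gt;
  <a href="/blog/">Blog</a> &gt;
  HTML5 and SEO
</nav>
```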

<article>

The <article> tag is intended to encapsulate a self-contained piece of content. This could be a blog post, an informational page or a discussion forum entry. The idea is that anything in an <article> tag can stand on its own as a piece of content (for example, if you saw it alone in an RSS feed). Text and images within an <article> are more clearly identified as primary content, and will likely be weighted as more significant to a page than something tucked in an <aside> or <footer>.

<section>

The <section> tag is used to define related stretches of content. For example, an <article> may be broken up into different <section> areas, each usually starting with a <header>, <h2> or similar lead-in. Unlike an <article>, a <section> is not a standalone piece of content.

One potential benefit to this organization is that it may allow search engines to consider each <section> more independently, rather than as one long piece of content. If one <section> mentions a given phrase multiple times, it is logical that a search engine might treat that content as more authoritative than if the phrase were scattered across multiple <section> areas.
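As a minimal sketch, an <article> divided into <section> areas could be marked up as follows. The topic and headings are placeholders, and whether search engines actually weigh each section separately remains speculation, as noted above.

```html
<!-- Fragment: place inside <body>. Headings and copy are placeholders. -->
<article>
  <h1>A Beginner’s Guide to Espresso</h1>

  <!-- Each section groups related content under its own lead-in -->
  <section>
    <h2>Choosing Beans</h2>
    <p>Content focused on beans: origin, roast and freshness.</p>
  </section>

  <section>
    <h2>Grinding</h2>
    <p>Content focused on grind size and grinder types.</p>
  </section>
</article>
```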

<aside>

While the temptation is to assume that the <aside> tag equates to any sidebar type content, the intention is more specific. This element identifies content that is set aside from the primary content, but which relates to it. Examples would be pull-quotes, footnotes, source links or other content that is ancillary, but which relates directly to the main content. By being able to clearly identify important secondary content, even if it’s visually placed over in a sidebar, you can distinguish it from less-important content like blogroll links and general branding info.
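One way to picture the distinction, using placeholder content: a pull-quote that relates directly to the post goes in an <aside>, while a general blogroll stays in an ordinary container.

```html
<!-- Fragment: place inside <body>. All content and URLs are placeholders. -->
<article>
  <h1>Post Title</h1>
  <p>Main content of the post.</p>

  <!-- Ancillary content that relates directly to the post -->
  <aside>
    <q>A pull-quote lifted from the post.</q>
  </aside>
</article>

<!-- General sidebar content that does not relate to this post -->
<div id="sidebar">
  <h2>Blogroll</h2>
  <ul>
    <li><a href="http://example.com/">Example Blog</a></li>
  </ul>
</div>
```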

<footer>

While not as useful as the <header>, the <footer> tag can be used to identify closing or secondary information at the end of a <section> or page. Like the <header> tag, there can be multiple <footer> tags on a page. In addition to common footer links, a <footer> may contain authorship material for a blog post, copyright information or related links.
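A sketch of multiple <footer> tags on one page, one closing a blog post and one closing the page itself, might look like this. The author name, company name and URLs are placeholders.

```html
<!-- Fragment: place inside <body>. Names and URLs are placeholders. -->
<article>
  <h1>Post Title</h1>
  <p>Main content of the post.</p>

  <!-- Footer for the post: authorship and related links -->
  <footer>
    <p>Written by Jane Doe</p>
    <p><a href="/blog/related-post/">Related post</a></p>
  </footer>
</article>

<!-- Footer for the page as a whole -->
<footer>
  <p>Copyright 2011 Example Company</p>
  <p><a href="/contact/">Contact</a></p>
</footer>
```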

Should I Use HTML5 Now?

HTML5 is solidifying as a standard, and is supported by all the major web browsers with some variation. It was only a year ago that Google was espousing an “it won’t help, but it can’t hurt” attitude toward HTML5. But in that year, the push toward HTML5 has continued unabated. As with the development of the schema.org standard, it’s clear that search engines like having a semantic standard that helps them improve how they understand and deliver content.

With that standard already in place, the sooner that sites adapt their content to HTML5, the better chance they have to stand out from the sea of indistinct <div> tags — and to be ahead of the curve as search algorithms increasingly weight clean, semantic code.

What are your thoughts on moving toward the HTML5 standard, and potential SEO benefits? Let us know in the comments!

David Gould

David is the Creative Services Director at Vertical Measures. He oversees written, visual and video content production, as well as social media and promotion. David works with clients to develop compelling content that pulls together business goals with customer needs. His 15 years experience in writing, design and web development have provided a perfect complement of skills for effective content marketing and strategy.

This entry was posted on Monday, October 17th, 2011 at 11:46 am and is filed under Search Engine Optimization.

17 Responses to “HTML5 and SEO: New Strategies for Optimizing Code”

  1. FD Thomson Says:

    Well David Gould! I came to know very useful information about HTML5 by reading this post. Thanks for sharing this post.

  2. Brent Nau Says:

    Thanks David. This was a nice introduction to HTML5. I am just starting to convert some of my own websites into HTML5.

  3. happysus Sait Says:

    I don’t understand how HTML5 could improve my site’s SEO. I think it increases site speed.

  4. David Gould Says:

    Thanks FD! And thanks Brent, I think it’ll be a good step to be ahead of the curve as ranking factors evolve.

  5. David Gould Says:

    Sait,

    It probably will have minimal impact on site speed, at least regarding the tags I discussed. Whether it’s an article or a div tag, it should take a comparable amount of time to parse and render the same content and styling.

    However, the SEO advantages are primarily that you’re explicitly indicating to search engines which of your content is more or less relevant. While it’s early in the process to see how Google and others will apply this to their algorithms, such a clear indication of content priority should become a notable factor, especially given the support the search engines have put into HTML5.

    Thanks!

  6. josh bachynski - martial arts seo guy Says:

    I have read that for the rel=canonical or rel=author LINK tags to work that they need to be implemented in an HTML5 page, that is to say one with an HTML 5 DOCTYPE – can anyone confirm this?

    If so, there is a reason to go to HTML 5 right now.

  7. Dominique Peladeau Says:

    means something in English, but not in French. I prefer the div id tag system because the id can also serve an SEO purpose (who knows what G does with the values contained in the id tags…). Also, since I am French and use French in the id tags, I avoid JavaScript or HTML reserved words.

  8. David Gould Says:

    Hi Josh,

    An HTML5 tag will work in a browser that supports it, whether or not the Doctype is HTML5. Unless the Doctype is an old quirks mode (which it really shouldn’t be), HTML5 and most other HTML4 Doctypes just trigger a browser’s standards mode and it will render the same either way. So an article tag will work in all modern browsers, whether or not you’re using the HTML5 Doctype. Likewise, a search engine will be able to use the rel=author attribute, whether or not it explicitly jells with the Doctype.

    Now, that is different from the code validating. rel=author is an HTML5 standard, so it won’t fully validate if your Doctype is HTML4, but it should still render and be utilized by search engines properly.

    How important validation is to you is your call. All things being equal, obviously validated code is preferred. But given that rel=canonical is not a standard (at least yet), and won’t validate even with an HTML5 Doctype, it’s a case of usefulness outweighing superficially perfect validation.

    Hope that helps!

  9. David Gould Says:

    Salut Dominique,

    It will still matter in French and other non-English languages. The benefit to the new HTML5 structural tags is not the tag words themselves (there is no keyword benefit to either header or div=header), but rather what those tags tell a search engine. A div tag doesn’t give search engines any clue as to how important its content is. These new HTML5 tags do.

    The idea is that whatever you nest in an article tag, for example, is likely more relevant to that page than something in the footer tag. That will have a likely SEO benefit, whatever language the content itself is in.

    And you can still use id attributes for the new tags as well, like you do currently with divs.

    Merci pour vos pensées!

  10. Dominique Peladeau Says:

    Hello David,

    I still prefer the div tag. Very often I use CSS in such a way that I present to G or B the main content of my web page first, not the header, not the left or right side, but the article itself. I am talking source code here. I do it that way because the CSS styles allow me to present the web page code in a certain order to the robots while the web page renders in another order, i.e. users see the web page as they should: header first, left nav or sidebar, etc. This way, G or B or whatever robot is indexing the web page gets to read the fresh content (article) first, not the header or the navigation (which can be very lengthy in some cases). I do not know if this would still be the case with the new HTML5 tags.
    Merci pour votre blogue!

  11. David Gould Says:

    You can style and lay out the HTML5 tags the exact same way you do currently with divs, you just get the added bonus of being able to convey additional semantic meaning to the search engines which can improve your SEO.

  12. Dominique Peladeau Says:

    Hello David,

    Thank you for the clarification. It’s always good when we can improve our SEO.
    Do you really think we should code for the search engines? I am thinking of the schema.org stuff here, which is similar to the semantic meaning the header and article tags should bring. I doubt that the added effort will give good ROI. In 2 or 3 years from now the search engines should start, I hope, to understand the natural relations between words, and all the schema.org stuff will be irrelevant. Also, I think black hat stuff will render the article, header, and all the other semantic tags useless, just like the meta tags got abused. I guess content is and will still be king.

    Have a nice day!

  13. David Gould Says:

    It’s really minimal effort and a very solid ROI. It takes a couple minutes to code an address to schema.org standards, and you’ve immediately taken all the guesswork out of search engines recognizing you for local search. There will always be someone trying to game the system, but the ability for white hat SEO to give precise meaning to search engines is huge.

    Content is king, but good semantic markup is the crown on its head that lets everyone know.

  14. SEO Recap: Amazing Finds On Twitter This Week | Search Engine Journal Says:

    [...] HTML5 and SEO: New Strategies for Optimizing Code – Vertical [...]

  15. Noy Says:

    Excellent post I need other options other than just relying on google. It’s nice to finally read some proper seo tactics. There’s too much contradicting info on the web. Very easy to follow well written article. Thanks very much.

  16. Gyi Tsakalakis Says:

    This was one of the more clear examples of how HTML5 may have benefits for communicating with search engines in the future. Thanks. Do you have any recommendations for HTML5 references? I’ve been poking around online, but am looking for something comprehensive.

  17. David Gould Says:

    Thanks Gyi.

    There are many articles out there addressing the subject in different ways. A good place to start is the W3Schools HTML5 reference — http://www.w3schools.com/html5/html5_reference.asp — which covers what’s new, deprecated, unsupported, as well as some nice examples of each tag.
