The evolution of HTML has been a long transition away from how things look and toward what things mean. Gone are the days when web pages were cobbled together in HTML tables to compensate for limited layout controls. (Or, rather, those days should be gone. You know who you are.) Improvements in the language and the rise of CSS allowed web programmers to build pages that looked great for users and made better sense to search engines.
But even with the progress in semantic markup (coding to describe what content conveys, not what it looks like), there has still been the necessary evil of <div> and <span> tags. These semantically meaningless elements continued to be used as a fallback for structure and design when semantic tags fell short. While this was closer to the ideal than tables, it was still a programming crutch to fill a shortcoming in the HTML language, and another bit of meaningless code for search engines to work through to get to the stuff with purpose.
With the advent of HTML5, many of those shortcomings have been addressed. Ubiquitous workarounds like <div id="header"> and <div id="footer"> have now been given their own formal tags, <header> and <footer>. This allows web content to be organized in a way that’s sensible, and may influence how search engines view, understand and rank content.
Here’s a common example of how a page relying on <div> tags would be structured:
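A minimal sketch of such a layout, assuming typical id names (header, nav, section, sidebar, footer) and placeholder text rather than any particular site's markup:

```html
<!-- Pre-HTML5 layout: every region is a generic <div>, told apart only by its id. -->
<!-- The id values and link targets are illustrative assumptions; real sites vary. -->
<div id="header">
  <h1>Site Name</h1>
  <div id="nav">
    <a href="/">Home</a>
    <a href="/about">About</a>
  </div>
</div>
<div id="section">
  <h2>Post title</h2>
  <p>Main content goes here.</p>
</div>
<div id="sidebar">
  <p>Notes related to the post.</p>
</div>
<div id="footer">
  <p>Copyright information.</p>
</div>
```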
It gets the job done, but identifiers like “section” or “sidebar” don’t mean anything to search engines. And as naming conventions vary from programmer to programmer, a content block serving as a sidebar might be called “sidebar” or “side” or “secondary” or something else entirely. But now let’s look at that same page structure using the new HTML5 tags:
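As a rough sketch, the same kind of page might be marked up as follows. The placeholder text and link targets are assumptions for illustration only:

```html
<!-- Same regions as a <div>-based layout, but each tag states its role. -->
<header>
  <h1>Site Name</h1>
  <nav>
    <a href="/">Home</a>
    <a href="/about">About</a>
  </nav>
</header>
<article>
  <section>
    <h2>Post title</h2>
    <p>Main content goes here.</p>
  </section>
</article>
<aside>
  <p>Notes related to the post.</p>
</aside>
<footer>
  <p>Copyright information.</p>
</footer>
```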
The content is organized the same, but each element now has a meaning that’s explicitly clear. No longer forced to guess between variously named <div> tags, a search engine can more easily and consistently gauge the content on your page for importance and relevance.
New Structural Tags
A number of tags are being added or deprecated in this (still developing) HTML5 standard. I’ve previously discussed how HTML5 microdata has been adopted by schema.org for local optimization and other rich snippets, but here are some of the more relevant structural developments I used above:
Almost every site has some flavor of <div id="header"> that contains the site logo, site-wide branding and introductory content. The new <header> tag replaces this workaround. More importantly, it can also be used additional times on a page to serve as an introductory area for each <section> tag. These types of <header> tags typically contain headings (<h1>, <h2>, etc.), and the purpose is similar: to briefly summarize and introduce subsequent content.
However, <header> tags can also contain expanded content like paragraph tags and optimized links; while these likely won’t get as much SEO weight as a heading tag, this affords more options beyond what you can cram into an <h2>.
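For instance, a <header> introducing a single <section> could look roughly like this. The heading text and the /layout-guide URL are made-up placeholders:

```html
<section>
  <!-- A section-level header: a heading plus a short summary and a link. -->
  <header>
    <h2>Choosing a layout</h2>
    <p>A short summary of this section, with a
       <a href="/layout-guide">link to the full guide</a>.</p>
  </header>
  <p>The body of the section follows its introductory header.</p>
</section>
```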
Typically near or nested within the page’s main <header> tag, the <nav> tag serves to identify a collection of links on a page. This can be of particular value for the main site navigation, both to make the website structure more apparent and to identify other key sets of links.
Other areas to use the <nav> tag are secondary navigation links (e.g. related articles), pagination links, and breadcrumb navigation links. When you identify these main navigation areas, bots crawling your site have a clearer path through your content.
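A quick sketch of these uses, with hypothetical link targets and labels standing in for a real site's navigation:

```html
<!-- Main site navigation, nested in the page header. -->
<header>
  <nav>
    <a href="/">Home</a>
    <a href="/blog">Blog</a>
    <a href="/contact">Contact</a>
  </nav>
</header>

<!-- Breadcrumb navigation. -->
<nav>
  <a href="/">Home</a> &gt; <a href="/blog">Blog</a> &gt; Current post
</nav>

<!-- Pagination links. -->
<nav>
  <a href="/blog/page/1">Previous</a>
  <a href="/blog/page/3">Next</a>
</nav>
```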
The <article> tag is intended to encapsulate a self-contained piece of content. This could be a blog post, informational page or discussion forum entry. The idea is that anything in an <article> tag can stand on its own as a piece of content (for example, if you saw it alone in an RSS feed). Text and images within an <article> are more clearly identified as primary content, and are likely to be weighted as more significant to a page than something tucked in an <aside> or <footer>.
The <section> tag is used to define related stretches of content. For example, an <article> may be broken up into different <section> areas, each usually starting with a <header>, <h2> or similar lead-in. Unlike an <article>, a <section> is not a standalone piece of content.
One potential benefit to this organization is that it may allow search engines to consider each <section> more independently, rather than as one long piece of content. If one <section> mentions a given phrase multiple times, it is logical that a search engine might look at that content more authoritatively than if the phrase were used more sporadically across multiple <section> areas.
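To illustrate the relationship, here is one way an <article> might be divided into <section> areas. The titles and text are placeholders, and how search engines actually weight each section is not something this markup guarantees:

```html
<article>
  <h1>A guide to semantic markup</h1>

  <!-- Each section groups related content under its own lead-in heading. -->
  <section>
    <h2>Why structure matters</h2>
    <p>Content focused on the first topic.</p>
  </section>

  <section>
    <h2>Which tags to use</h2>
    <p>Content focused on the second topic.</p>
  </section>
</article>
```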
While the temptation is to assume that the <aside> tag equates to any sidebar-type content, the intention is more specific. This element identifies content that is set aside from the primary content, but which relates to it. Examples would be pull-quotes, footnotes, source links or other content that is ancillary, but which relates directly to the main content. By being able to clearly identify important secondary content, even if it’s visually placed over in a sidebar, you can distinguish it from less-important content like blogroll links and general branding info.
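A small sketch of that distinction, assuming a pull-quote as the related content and a generic sidebar <div> (with a made-up id) for everything else:

```html
<article>
  <h2>Post title</h2>
  <p>Main text of the post.</p>

  <!-- Ancillary, but directly related to the article: a pull-quote. -->
  <aside>
    <blockquote>A key line pulled from the post.</blockquote>
  </aside>
</article>

<!-- Material unrelated to the article can stay in a plain <div>. -->
<div id="sidebar">
  <p>Blogroll links and general branding info.</p>
</div>
```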
While not as useful as the <header>, the <footer> tag can be used to identify closing or secondary information at the end of a <section> or page. Like the <header> tag, there can be multiple <footer> tags on a page. In addition to common footer links, a <footer> may contain authorship material for a blog post, copyright information or related links.
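For example, a page might carry one <footer> for a blog post and another for the page as a whole. The author name and link targets below are placeholders:

```html
<article>
  <h2>Post title</h2>
  <p>Main text of the post.</p>
  <!-- Footer scoped to the post: authorship material. -->
  <footer>
    <p>Posted by Example Author</p>
  </footer>
</article>

<!-- Footer for the page as a whole: copyright and common links. -->
<footer>
  <p>Copyright information</p>
  <a href="/contact">Contact</a>
</footer>
```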
Should I Use HTML5 Now?
HTML5 is solidifying as a standard, and is supported by all the major web browsers with some variation. It was only a year ago that Google was espousing an “it won’t help, but it can’t hurt” attitude toward HTML5. But in that year, the push toward HTML5 has continued unabated. As with the development of the schema.org standard, it’s clear that search engines like having a semantic standard that helps them improve how they understand and deliver content.
With that standard already in place, the sooner that sites adapt their content to HTML5, the better chance they have to stand out from the sea of indistinct <div> tags — and to be ahead of the curve as search algorithms increasingly weight clean, semantic code.
This entry was posted on Monday, October 17th, 2011 at 11:46 am and is filed under Search Engine Optimization.