It’s a rare treat in the technology world when industry giants voluntarily agree to a shared standard. Usually it’s a Blu-ray vs. HD DVD scenario, where the players back competing standards and consumers are left hoping they’ve chosen the option that will win out (somewhere, a Laserdisc owner is sobbing).
So it was encouraging earlier this month when Google, Microsoft and Yahoo! announced the Schema.org initiative. The collaboration creates a uniform standard for HTML markup that is supported by the three major search engines. Using these standardized “schema,” websites can define their content more precisely with semantic markup and help it appear more prominently in search results.
What Is the Purpose of Semantic Markup?
In 2009, Google announced rich snippets: a way to present search results that contain specific kinds of information. Websites could use semantic markup (code hidden from users but visible to search engines) to describe what a given piece of content contained. For example, if a restaurant review site incorporated the right semantic markup, its search results could show not just the usual snippet of page title and short excerpt, but also the rating, price range and type of food.
Certainly that’s more useful information for the prospective diner, and a great way for a website to distinguish its relevant content. Rich snippets now exist for many kinds of data, including products, people, businesses and recipes, and are a strong way to optimize content, particularly for local search.
Why Was Schema.org Necessary?
When content contains the word “Berlin,” does it refer to the German capital or the town in Vermont? Musically, does “Berlin” refer to the 1980s synthpop band, the Lou Reed album Berlin, or songwriter Irving Berlin? Search engines have gotten quite good at guessing meaning, but the emergence of semantic markup shows the additional benefit of publisher-provided context.
Until now, there have been three main options for this kind of semantic markup: RDFa, microformats, and the more recent HTML5 microdata. Each has its strengths and weaknesses, but for Schema.org, the search engines opted to throw their weight behind the microdata format, which they felt balanced simplicity with flexibility. Now, there are semantic data standards for everything from website elements to live music events to volcanoes. (Seriously, volcanoes.)
The good news is that RDFa and microformats will continue to be supported for those who have implemented them, so they’re not stuck with the HD DVDs of the semantic web. But while the existing code won’t become invalid, it’s imaginable that search engines will eventually give more weight to websites that use their preferred format. The better news is that webmasters no longer have to weigh the virtues of microformats vs. microdata vs. RDFa, and can instead focus on the best implementation of the chosen markup for their content.
How It Works
Take this example from the Schema.org documentation for Restaurant markup. Here’s the original content for a generic restaurant called GreatFood:
GreatFood
4 stars - based on 250 reviews
1901 Lemur Ave
Sunnyvale, CA 94086
(408) 714-1489
<a href="http://www.greatfood.com">www.greatfood.com</a>
Hours:
Mon-Sat 11am - 2:30pm
Mon-Thur 5pm - 9:30pm
Fri-Sat 5pm - 10pm
Categories: Middle Eastern, Mediterranean
Price Range: $$
Takes Reservations: Yes
The relevant information is all there, but the search engines are left trying to guess what it means. Now look at that same content marked up with microdata:
<div itemscope itemtype="http://schema.org/Restaurant">
  <span itemprop="name">GreatFood</span>
  <div itemprop="aggregateRating" itemscope itemtype="http://schema.org/AggregateRating">
    <span itemprop="ratingValue">4</span> stars -
    based on <span itemprop="reviewCount">250</span> reviews
  </div>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">1901 Lemur Ave</span>
    <span itemprop="addressLocality">Sunnyvale</span>,
    <span itemprop="addressRegion">CA</span> <span itemprop="postalCode">94086</span>
  </div>
  <span itemprop="telephone">(408) 714-1489</span>
  <a itemprop="url" href="http://www.greatfood.com">www.greatfood.com</a>
  Hours:
  <time itemprop="openingHours" datetime="Mo-Sa 11:00-14:30">Mon-Sat 11am - 2:30pm</time>
  <time itemprop="openingHours" datetime="Mo-Th 17:00-21:30">Mon-Thu 5pm - 9:30pm</time>
  <time itemprop="openingHours" datetime="Fr-Sa 17:00-22:00">Fri-Sat 5pm - 10:00pm</time>
  Categories:
  <span itemprop="servesCuisine"> Middle Eastern </span>,
  <span itemprop="servesCuisine"> Mediterranean </span>
  Price Range: <span itemprop="priceRange">$$</span>
  Takes Reservations: Yes
</div>
For non-programmers that may seem a little confusing, but look closely and you can see a descriptive HTML attribute wrapped around each bit of information. The telephone number is tagged with itemprop="telephone". The cuisine type is tagged with itemprop="servesCuisine". Each attribute tells the search engines explicitly “this is what this content should mean to you.” So the next time a user is searching for a Middle Eastern restaurant in that area, the search engines will have higher confidence that they are providing relevant information when they list GreatFood in the results.
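To make that idea concrete, here is a rough sketch of how a program might read those labels back out of a page. It is a toy written with Python's standard html.parser module, not how any search engine actually works. The names MicrodataReader, read_microdata and SAMPLE are made up for this illustration, and the sample is a shortened copy of the GreatFood markup. It only handles properties whose value is plain text or a nested item, so it ignores the special cases the microdata format defines for elements such as <a> (href) and <time> (datetime).

```python
from html.parser import HTMLParser

# Shortened copy of the GreatFood markup from the article (illustration only).
SAMPLE = """
<div itemscope itemtype="http://schema.org/Restaurant">
  <span itemprop="name">GreatFood</span>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="addressLocality">Sunnyvale</span>,
    <span itemprop="addressRegion">CA</span>
  </div>
  <span itemprop="telephone">(408) 714-1489</span>
  <span itemprop="servesCuisine"> Middle Eastern </span>,
  <span itemprop="servesCuisine"> Mediterranean </span>
</div>
"""

# Elements that never get a closing tag; skipped so the stack stays balanced.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class MicrodataReader(HTMLParser):
    """Toy reader (hypothetical): collects itemprop values into nested dicts."""

    def __init__(self):
        super().__init__()
        self.items = []   # top-level items found in the page
        self.stack = []   # one frame per currently open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attrs = dict(attrs)
        frame = {"prop": attrs.get("itemprop"), "item": None, "text": []}
        if "itemscope" in attrs:
            # itemscope starts a new item; itemtype says what kind it is.
            frame["item"] = {"type": attrs.get("itemtype"), "props": {}}
        self.stack.append(frame)

    def handle_data(self, data):
        # Text is credited to the innermost open element only.
        if self.stack:
            self.stack[-1]["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID or not self.stack:
            return
        frame = self.stack.pop()
        # A property's value is either a nested item or the element's text.
        value = frame["item"] or "".join(frame["text"]).strip()
        if frame["prop"]:
            # Attach the value to the nearest enclosing item.
            for outer in reversed(self.stack):
                if outer["item"]:
                    outer["item"]["props"].setdefault(frame["prop"], []).append(value)
                    break
        elif frame["item"]:
            self.items.append(frame["item"])


def read_microdata(html):
    reader = MicrodataReader()
    reader.feed(html)
    reader.close()
    return reader.items


restaurant = read_microdata(SAMPLE)[0]
print(restaurant["type"])                    # http://schema.org/Restaurant
print(restaurant["props"]["servesCuisine"])  # ['Middle Eastern', 'Mediterranean']
```

The point is that nothing here has to guess: the program never tries to work out whether “(408) 714-1489” looks like a phone number, because the markup already says so.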
What Does Schema.org Mean for Me?
The standards laid out with Schema.org are brand spankin’ new. There are a lot of custom schema, many as specific as “Tire Shop,” “TV Episode,” “HTML Table,” and “Taxi Stand” (and that’s just in the T’s). Certainly anything currently supported in rich snippets (business and individual information, reviews, product details) is clearly worth putting into practice.
But this is just the start. Currently, some related schema are identical in everything but name. As time goes on, those schema will expand to suit the given industry, product, or concept, so early adopters may see the benefits of identifying themselves on the semantic web. With the three big players collaborating on a uniform standard, it should be a clarion call to those who have yet to implement this type of on-site SEO: now is the time to tell search engines what your content means.