XML Sitemap Optimization Hacks I Learned from My 5-Year-Old
My kids love to create games; the more complex the better. One game my son invented this summer was an intricate treasure hunt set up in my childhood home. The best part was that it included all sorts of details on where to find the next clue. It was foolproof.
And that’s the key to XML sitemaps. The details matter. The fluff does not.
It’s simple. For an XML sitemap to be effective, include the information you want Google to index. Here’s an outline of the most important things to consider:
Considerations for XML Sitemap Optimization
How am I going to organize the map?
My son mapped out nearly an entire two-story house. Virtually every room could be used in his treasure hunt. To provide some perspective, he had a master map that identified rooms by numbers and colors. Then, each room had its own map with a matching number and color. If you solved the puzzle in the room, you found the map to the next room. And the game went on. There was no getting lost with this kind of organization. Areas of the house that didn’t matter were marked with a black “X”.
Start with a well-organized sitemap index page
Sitemaps are most effective when large chunks of data are broken out into separate child sitemaps and linked to from the sitemap index. Create child sitemaps for things like images and videos. Pages that are not valuable to search engines because they are mainly navigational or potentially viewed as duplicate content can be left out. This includes things like categories, tags, author pages and dated archives. I’ve also seen landing pages made strictly for paid search show up in sitemaps; those can be left out as well.
This will vary greatly depending on your business. For example, a real estate broker may not want to include all the images for every property listing on their site. Real estate listings often expire, so the images become outdated quickly.
Remember: If it’s in your master map, it’s important to the treasure hunter or Google!
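To make the master-map idea concrete, here is a minimal Python sketch that builds a sitemap index pointing to a few child sitemaps. The example.com URLs, the file names, the dates, and the helper name build_sitemap_index are all assumptions made up for illustration; your CMS or plugin will likely produce something similar for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(child_sitemaps):
    """Return sitemap index XML for an iterable of (url, lastmod) pairs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in child_sitemaps:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = lastmod
    ET.indent(root)  # pretty-print; requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical child sitemaps: one per content type.
children = [
    ("https://www.example.com/sitemap-pages.xml", "2018-07-01"),
    ("https://www.example.com/sitemap-images.xml", "2018-06-15"),
    ("https://www.example.com/sitemap-videos.xml", "2018-05-30"),
]
print(build_sitemap_index(children))
```

Each child sitemap plays the role of a room map, and the index is the master map that ties them together.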
How often do you make changes? What’s most important?
In my son’s treasure hunt game, most of the content – his treasure maps – did not change. There were a couple of exceptions, but once complete, it was done. Some pieces of the map were more important than others. There was even one higher-priority map that was a shortcut, allowing the treasure hunter to skip steps in the game if they could decode it!
One element of my son’s game that posed a challenge was a pair of large drums in an upstairs living room that were constantly being moved. He didn’t know how to identify this in his map. His solution ended up being a pair of swirly arrows that signaled that the drums could be found anywhere in the room. It was his way to deal with something that was going to constantly update.
Your XML sitemap needs to match your site. Setting the priority rules and change frequency rules helps, but the real key is making sure that your sitemap is updated whenever changes are made, including when new content is added, pages are removed, or the site structure changes.
Your sitemap and robots rules can be used as blueprints
My son took painstaking care to mark out the key features of each room in his master blueprint of the house. The location of the most important furniture; where the windows were located; every vital element of each room was shown so the map would be useful and easily understood. And each child map matched the master.
Your XML sitemap needs to match what Google will find when it crawls your website. A common mismatch is category pages that are included in the XML sitemap but are disallowed in the robots.txt file or tagged with a no-index robots meta tag. Remember, pages that search engines are instructed to ignore can be excluded from the XML sitemap. Including unnecessary pages in your XML sitemap is like Apple Maps giving you bad directions.
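One way to catch this kind of mismatch is to run the URLs from your sitemap against your robots.txt rules. The sketch below uses Python's standard urllib.robotparser; the robots.txt content, the URLs, and the helper name blocked_sitemap_urls are assumptions for illustration. Note that this parser does simple prefix matching and does not understand the wildcard patterns Google supports, so treat it as a rough first pass.

```python
from urllib.robotparser import RobotFileParser


def blocked_sitemap_urls(robots_txt, sitemap_urls, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt tells user_agent not to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in sitemap_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that disallows category pages.
robots_txt = """\
User-agent: *
Disallow: /category/
"""

# Hypothetical URLs pulled from an XML sitemap.
sitemap_urls = [
    "https://www.example.com/services/",
    "https://www.example.com/category/news/",
]

for url in blocked_sitemap_urls(robots_txt, sitemap_urls):
    print("In sitemap but disallowed:", url)
```

Any URL this reports is a candidate for removal from the sitemap, or a sign that the disallow rule is broader than you intended.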
BONUS: How to properly use disallow in robots.txt and meta robots no-index, follow
There is a big difference in how Google interprets disallow statements in the robots file versus a meta robots no-index, follow tag. Disallow instructs search engines not to crawl the URL at all, so no link juice flows through pages that are disallowed. A meta robots no-index, follow tag, on the other hand, keeps the page out of the index, but search engines can still crawl the page and access its links, so the link juice flows through those pages.
For pages that are navigational like category and archive pages, it’s best to use a meta robots no-index, follow on them to allow any link juice to flow through those pages to the deeper pages of your site!
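If you want to spot-check which pages carry the tag, a small script can read the meta robots directives out of a page's HTML. This is only a sketch using Python's built-in html.parser; the class name MetaRobotsParser, the helper robots_directives, and the sample category page are made up for illustration.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            if directive.strip():
                self.directives.add(directive.strip().lower())


def robots_directives(html):
    """Return the set of meta robots directives found in an HTML string."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical category page that should stay out of the index
# while still passing link value to the pages it links to.
category_page = """
<html><head>
<title>Category: News</title>
<meta name="robots" content="noindex, follow">
</head><body><a href="/news/some-article/">Some article</a></body></html>
"""

print(sorted(robots_directives(category_page)))
```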
Generating an XML Sitemap is Easy
If you operate a large, database-driven website, your content management system most likely has an automated XML Sitemap extension, module or plugin. If you operate a smaller, static site, you may not have an automated tool to manage this. There are many free XML Sitemap generators available.
One good option is XML-Sitemaps.
Another popular solution is to use a tool like Screaming Frog to generate your XML sitemap based on a crawl of your website.
Most websites have some content that changes infrequently and other content that changes very frequently. And that’s what the priority and change frequency options in your XML sitemap are for.
Use changefreq & priority properly
The pages of your website are not all “high priority”. Nor do they change hourly or daily. Frequently, I see sites that have the priority of every page of their website set to 1 (the highest priority) and the change frequency set to hourly or daily.
Change frequency tells search engines how frequently a page changes. The update refers to changes to the actual HTML content, not updated images or video.
- Yearly: Contact page, about us types of pages
- Monthly: FAQs, instructional content, articles that get updated on a regular basis
- Weekly: Product info pages, service pages, location pages
- Daily: Blog index, classifieds, etc.
- Hourly: Real estate city page, major news site, weather information, forums
- Always: Stock market data, real-time reporting page
- Never: Outdated news, press releases, etc.
Priority tag guidelines
The priority tag uses a scale from 0.0 to 1.0. The higher the value, the higher priority the page is. Use the following guidelines for setting the priority of your pages:
- 0.0-0.3: Outdated news, info that is no longer relevant, but historically useful to the business
- 0.4-0.7: Articles, blog entries, category pages, FAQs
- 0.8-1.0: Homepage, product pages, service pages, major site features
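Here is a rough Python sketch of how the change frequency and priority guidelines above translate into sitemap entries. The example.com pages and the helper name build_urlset are assumptions for illustration; the set of allowed changefreq values and the 0.0 to 1.0 priority range come from the sitemap protocol itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
CHANGEFREQ_VALUES = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def build_urlset(pages):
    """Return sitemap XML for an iterable of (loc, changefreq, priority)."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq, priority in pages:
        if changefreq not in CHANGEFREQ_VALUES:
            raise ValueError(f"invalid changefreq for {loc}: {changefreq}")
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority out of range for {loc}: {priority}")
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    ET.indent(root)  # pretty-print; requires Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, with values chosen from the guidelines above.
pages = [
    ("https://www.example.com/services/", "weekly", 0.9),
    ("https://www.example.com/faq/", "monthly", 0.5),
    ("https://www.example.com/contact/", "yearly", 0.5),
    ("https://www.example.com/press/old-announcement/", "never", 0.2),
]
print(build_urlset(pages))
```

The checks inside the loop are there to catch the most common mistakes early, such as a typo in the change frequency or a priority above 1.0.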
Use Tools to Validate your Map
My son asked adults to make sure his map was correct. He understood that his maps had to be accurate for the game to be a success.
You can validate your XML sitemap using tools like the SEOChat XML Sitemap Validator. Additionally, you should submit your Sitemap to Google Search Console.
Information you can learn about your sitemap in Search Console:
- When your sitemap index file was submitted
- Sitemaps listed in the index file including the number of webpages, images and other content that has been submitted and indexed by Google
- Sitemap errors including 404 errors and URLs blocked by robots.txt
Don’t forget to test your XML Sitemap in Google prior to submitting it!
To test an unsubmitted sitemap in Google Search Console:
1. Click Add/Test Sitemap on the report landing page.
2. Enter the URL of the sitemap in the dialog that appears, and click Test.
3. When the test is completed, click Open Test Results to check for errors. Fix your errors.
4. After you fix your errors, click Submit Sitemap.
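Before you test in Search Console, a quick local check can catch basic structural problems. The following Python sketch is not a full validator; it only confirms that the file is well-formed XML, uses the expected root element, stays within the protocol's limit of 50,000 URLs per file, and lists absolute, non-duplicated URLs. The function name check_sitemap and the sample sitemap are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000  # per-file limit in the sitemaps.org protocol


def check_sitemap(xml_text):
    """Return a list of problems found in a urlset sitemap (empty if none)."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if root.tag != NS + "urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    urls = root.findall(NS + "url")
    if len(urls) > MAX_URLS:
        problems.append(f"too many URLs in one file: {len(urls)}")

    seen = set()
    for url in urls:
        loc = (url.findtext(NS + "loc") or "").strip()
        parts = urlparse(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"loc is not an absolute URL: {loc!r}")
        elif loc in seen:
            problems.append(f"duplicate loc: {loc}")
        seen.add(loc)
    return problems


# Hypothetical sitemap with one relative URL that should be flagged.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>/services/</loc></url>
</urlset>
"""
for problem in check_sitemap(sample):
    print(problem)
```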
How Do Search Engines Find Your Sitemap?
When my son was finished with his treasure maps, he simply handed me the master map of the house and told me where to start.
The best way to introduce search engines to your sitemap is to include a link to it in your robots.txt file. This is called Sitemap Discovery, and it’s something that Google, Yahoo and Microsoft (now Bing) introduced back in 2007 to help their crawlers find XML sitemaps.
Think of your robots file as one-stop shopping for the rules to your website. The robots file should contain disallow statements for those URLs that Google can safely ignore as well as a link to your XML sitemap that includes the URLs Google should crawl and index.
To use this feature, add a Sitemap line to your robots.txt file that contains the full URL of your XML sitemap index file.
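As a sketch, the robots.txt content in the Python snippet below shows what that looks like: a disallow rule plus a Sitemap line with a full URL. The example.com address and the /search/ path are placeholders, and the helper name discover_sitemaps is made up. The snippet reads the Sitemap line back with the standard urllib.robotparser module (its site_maps() method needs Python 3.8 or later), which is roughly how a crawler would discover it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one disallow rule and a link to the sitemap index.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/

Sitemap: https://www.example.com/sitemap_index.xml
"""


def discover_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(discover_sitemaps(ROBOTS_TXT))
```

The Sitemap line is independent of the User-agent groups, so it can sit anywhere in the file; placing it at the end keeps it easy to find.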
With some thorough planning, you can implement an XML sitemap that will lead Google to all the treasure – your most valuable content – on your website and avoid the fluff.