What kind of maps do spiders make?
...
Web-based maps.
Ba dumm, tsss!
I know, hilarious.
But that’s exactly what a sitemap is: a web-based map to your website that search engine spiders use to crawl pages faster and more effectively.
The only difference is that you make this map, and the itsy bitsy spiders of the internet navigate it to find every dark corner of your website.
In this article, we’re going to explore everything you need to know about creating and optimizing your SEO sitemap to ensure search engines crawl more, index more, and rank more.
Are you ready?
Let’s go!
A sitemap is a file that lists your website’s URLs and provides specific information to search engines about web pages, media and other files that exist on your website. Like the name suggests, it’s a literal map to your website.
Your sitemap is like a secondary discovery method for web crawlers: even though search engine spiders don’t always need a sitemap, having one ensures they don’t miss anything.
What’s a sitemap look like?
Here’s an example of our sitemap:
This is a sitemap; however, it’s dressed up with styling so it looks neat and organized.
If we examine the XML underneath, this is what sitemap syntax actually looks like:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.klientboost.com/</loc>
<lastmod>2021-10-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>https://www.klientboost.com/ppc-agency</loc>
<lastmod>2021-10-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>https://www.klientboost.com/seo-agency</loc>
<lastmod>2021-10-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
And so on, repeating a new section for every URL.
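If you’d rather not type those blocks by hand, here’s a minimal Python sketch that builds the same structure with the standard library. The function name `build_sitemap` and the (URL, last-modified date) input format are our own assumptions for illustration; they aren’t part of any official tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string for a list of (loc, lastmod) pairs."""
    # The namespace is written as a plain attribute to keep the sketch short.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


entries = [
    ("https://www.klientboost.com/", "2021-10-01"),
    ("https://www.klientboost.com/ppc-agency", "2021-10-01"),
    ("https://www.klientboost.com/seo-agency", "2021-10-01"),
]
print(build_sitemap(entries))
```

Each tuple becomes one `<url>` section, so adding a page means adding one line to the list.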
Google crawls and indexes the web largely by following internal links (links to other pages within the same domain) and external links (links to other domains).
Sitemaps help Google crawl and index your website more efficiently by providing them with explicit instructions on what pages to crawl and prioritize, along with additional information about videos, images, and news entries.
The result? Search engines can rank more pages more often and better display information along with those rankings.
What type of additional information can you specify other than URLs?
- Page titles
- Date published
- Last modified
- How frequently a page is likely to change
- Priority of pages (though Google ignores these)
- Alternate mobile URLs
- Alternate language URLs
- Video titles, descriptions, duration, expiration date, rating, price, family friendly, restrictions, requires subscription
- Image titles, type, license, location, captions
- News publication name and language, article title
Note: Search engines use sitemaps as recommendations. Just because you tell them which pages are priorities or which pages exist, it doesn’t mean they have to listen.
Google supports three sitemap formats:
- XML sitemap: XML stands for extensible markup language. It’s similar to HTML, but whereas HTML displays data, XML stores and transfers it. All you need to know is that an XML sitemap is the standard; it’s what you’ll use in almost every case (and what content management systems like Wix and Squarespace, or WordPress plugins like Yoast, use). Why? Because you can provide the most information with it.
- Text sitemap: If your sitemap only includes URLs (no extra information, images, or videos), you can use a basic text sitemap that lists URLs, line by line. Supported, but limited.
- RSS sitemap: RSS is a web feed for blogs that people can subscribe to and get automatic updates via email. You can also submit your RSS feed link to Google and they’ll read it as a sitemap. Like text sitemaps, RSS sitemaps only include URLs.
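To show how bare-bones the text format is, here’s a rough Python sketch that turns a list of URLs into a text sitemap: one URL per line, nothing else. The helper name `build_text_sitemap` is hypothetical, and dropping blanks and duplicates is our own tidy-up rather than a requirement.

```python
def build_text_sitemap(urls):
    """Return a text sitemap: one URL per line, blanks and duplicates dropped."""
    cleaned = (url.strip() for url in urls)
    # dict.fromkeys keeps the first occurrence of each URL, in order.
    unique = list(dict.fromkeys(url for url in cleaned if url))
    return "\n".join(unique) + "\n"


print(build_text_sitemap([
    "https://www.klientboost.com/",
    "https://www.klientboost.com/ppc-agency",
    "https://www.klientboost.com/",  # duplicate, dropped
]))
```

Save the result as a UTF-8 encoded .txt file before submitting it.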
On top of those three formats, Google supports three sitemap extensions:
- Video sitemaps: Use this for videos that are time-sensitive or hard to find on your website.
- Image sitemaps: Use this if you want more control over the information Google displays about your images in search.
- News sitemaps: Use when you publish time-sensitive news or want articles featured in Google News.
In most cases, you won’t need a separate image, news, or video sitemap since images and videos will appear within pages that appear within your normal sitemap, and most websites aren’t news outlets.
However, if your website hosts tons of videos and images, or if you’re a news publisher, you can create a sitemap extension for each.
Whereas an XML sitemap is a file you would submit to Google or Bing, an HTML sitemap refers to an expanded list of links that lives on your website mainly to help visitors navigate better.
For example, our HTML sitemap lives within the footer of our website and appears on every page (though some websites place an HTML sitemap on its own page).
Does Google support HTML sitemaps?
You can’t submit an HTML sitemap through Google Search Console. So in that respect, no. However, Google’s algorithm crawls links, and HTML sitemaps are chock full of them. So including one in your footer can help ensure Google finds the most important pages on your website.
Google recommends submitting a sitemap in four scenarios:
- Large websites: If you have a website with thousands of pages, best to submit a sitemap to help search engine crawlers find and index every page.
- Isolated pages: Hopefully, most of the pages on your website are easy to find by visitors and search engines. However, in some cases, you may not want pages to be found easily by any visitor, only those searching for specific information. In that case, best to submit a sitemap so Google knows those pages exist.
- New website: New sites are often more difficult to crawl and index because few sites (if any) link to them yet. If you just published a website, always submit a sitemap to Google to ensure each page gets indexed fast.
- Sites with rich media or news: Sitemaps let you send additional information to Google that they can use to better rank media and better display media within SERPs (search engine results pages). If you’re a media site with news or loads of rich media like images and videos, or you’re an ecommerce website with product photos attached to every product, best to submit a sitemap.
Since sitemaps are so easy to create and submit, we’d never suggest skipping one.
But the truth remains: for the average website with a few dozen pages, standard media, and a user-friendly site structure, Google should be able to crawl and index all your pages without help.
You can auto-create a dynamic XML sitemap (recommended) so that whenever your website is updated with new pages, your sitemap gets updated too. Or you can manually create a static sitemap using a free sitemap generator.
If you built your website on a content management system (CMS) like Squarespace, Wix, Shopify, or BigCommerce, they’ll auto-create an XML sitemap for you. Just don’t forget to submit it to Google.
If you built your website on WordPress, use an XML sitemap generator plugin like Yoast SEO. They’ll auto-create an XML sitemap for you too. Yoast also has functionality for image, video, and news sitemap extensions.
Or you can use a free tool like XML Sitemap Generator. Just add your URL and they’ll create an XML sitemap for you. The only downside to manually creating a sitemap is that every time you update your site, you’ll have to manually create an updated sitemap.
Sitemaps are easy to create, especially considering how many tools exist that will do it for free. But there are a handful of best practices Google recommends you follow.
This should go without saying, but don’t use inconsistent URL formats.
For example, use absolute URLs like https://klientboost.com/category/seo/, not relative URLs like /category/seo. And keep the syntax the same across all URLs: don’t use https://www.klientboost.com/category/seo/ (with “www”) in some places and the version without “www” in others, and don’t use http:// when your website has been migrated to https://.
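A quick way to catch these slip-ups before you submit is to run your URL list through a small check. The sketch below is one possible approach, not an official validator; the function name `find_inconsistent_urls` and its parameters are assumptions made for this example.

```python
from urllib.parse import urlsplit


def find_inconsistent_urls(urls, host, scheme="https"):
    """Return (url, problem) pairs for URLs that break the chosen format."""
    problems = []
    for url in urls:
        parts = urlsplit(url)
        if not parts.scheme or not parts.netloc:
            problems.append((url, "relative URL"))
        elif parts.scheme != scheme:
            problems.append((url, "wrong scheme"))
        elif parts.netloc.lower() != host:
            problems.append((url, "wrong host"))
    return problems


urls = [
    "https://klientboost.com/category/seo/",
    "/category/seo",
    "http://klientboost.com/category/seo/",
    "https://www.klientboost.com/category/seo/",
]
for url, problem in find_inconsistent_urls(urls, host="klientboost.com"):
    print(f"{problem}: {url}")
```

Pick whichever host and scheme your site actually resolves to, then pass them in; the check only flags URLs that stray from that choice.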
Google allows sitemaps up to 50MB (uncompressed) and 50,000 URLs. Anything bigger should be split into multiple sitemaps.
50,000? I know, HUGE.
Most websites will never get that big.
But here’s a pro tip: split your sitemap into multiple parts anyway.
According to Google’s John Mueller: “There is no ideal size (other than that it should be below the maximum size limits :-)). I generally recommend splitting a sitemap file into logical parts of your site so that you can monitor those parts individually (eg, category pages vs detail pages vs whatever else you have).”
If you split your sitemap into logical parts, when you pull up your sitemap report inside Search Console, you can easily see which part of your site is not performing the way it’s supposed to.
How do you create multiple sitemaps?
Create a sitemap index file (it’s like a sitemap for your sitemap). Your sitemap index file is one sitemap that links to multiple sitemaps.
For example, if you visit klientboost.com/sitemap.xml, you’ll see our sitemap index file:
If you check the source code, this is what our sitemap index file’s XML looks like without styling.
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="sitemap.xsl"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://klientboost.com/sitemap-posts.xml</loc>
<lastmod>2021-10-26T16:58:34.000Z</lastmod>
</sitemap>
<sitemap>
<loc>https://klientboost.com/sitemap-pages.xml</loc>
<lastmod>2021-10-26T20:02:55.782Z</lastmod>
</sitemap>
<sitemap>
<loc>https://klientboost.com/sitemap-categories.xml</loc>
<lastmod>2021-10-26T20:02:55.770Z</lastmod>
</sitemap>
<sitemap>
<loc>https://klientboost.com/sitemap-case-studies.xml</loc>
<lastmod>2021-10-25T17:31:30.000Z</lastmod>
</sitemap>
</sitemapindex>
One sitemap URL. Multiple sitemaps within it.
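Here’s a rough Python sketch of the idea: group your URLs into logical sections, then build an index that points at one child sitemap per section. Grouping by the first folder in the path, the “pages” catch-all, and the `sitemap-<section>.xml` naming are all assumptions for this example, and the example.com URLs are placeholders. The sketch only builds the index; each child sitemap still has to be generated separately.

```python
from collections import defaultdict
from urllib.parse import urlsplit
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def group_by_section(urls):
    """Group URLs by first path folder; top-level URLs go under "pages"."""
    groups = defaultdict(list)
    for url in urls:
        segments = [part for part in urlsplit(url).path.split("/") if part]
        section = segments[0] if len(segments) > 1 else "pages"
        groups[section].append(url)
    return dict(groups)


def build_sitemap_index(base_url, sections):
    """Return a sitemap index with one <sitemap> entry per section."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for section in sections:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = f"{base_url}/sitemap-{section}.xml"
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs for illustration only.
urls = [
    "https://example.com/about",
    "https://example.com/blog/first-post/",
    "https://example.com/blog/second-post/",
]
groups = group_by_section(urls)
print(build_sitemap_index("https://example.com", groups))
```

Because each section gets its own file, each one shows up as its own row in your Search Console sitemap report.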
You can tell Google which URLs you prioritize, but they don’t need to listen. Plus, where you place a URL (whether it’s first or last) doesn’t change the way search engines read it.
Google allows you to submit multiple language/region versions of a webpage within the same file using the hreflang attribute. For details, check Google’s documentation on specifying alternate languages.
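In a sitemap, hreflang annotations are written as `xhtml:link` elements inside each `<url>` entry, and every language version lists all of its alternates (itself included). The Python sketch below shows one way to generate that; the function name `build_hreflang_sitemap` and the example.com URLs are placeholders we made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(alternates):
    """Build a sitemap for one page; `alternates` maps hreflang code -> URL."""
    # Namespaces are declared as plain attributes to keep the sketch short.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:xhtml": XHTML_NS})
    for loc in alternates.values():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # Every version lists all language versions, including itself.
        for lang, href in alternates.items():
            ET.SubElement(url, "xhtml:link", rel="alternate", hreflang=lang, href=href)
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_hreflang_sitemap({
    "en": "https://example.com/en/",  # placeholder URLs
    "de": "https://example.com/de/",
}))
```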
Google actually recommends that if you have separate mobile and desktop URLs (e.g. m.example.com vs. example.com) you only specify one version.
However, you can still annotate your sitemap to include both if you’d like.
A canonical tag is an HTML tag that tells search engines which page is the original in the event that multiple instances of the same page exist on different URLs.
For example, it’s not uncommon for the same ecommerce product to appear on multiple URLs depending on which filter the visitor used to find it.
Did they search by color? By size? By date? By price-range?
Depending on the CMS, all four of those filters may produce the same product, only on different URLs.
Let’s take a look: When searching an ecommerce site for Power Rangers action figures, you can find the green ranger on multiple URLs depending on how you used their product filter:
/action-figures/green-ranger/
/power-rangers/green-ranger/
/90s/green-ranger/
/product/green-ranger/ (Canonical)
To avoid Google thinking you have duplicate content (or multiple versions of the same page), you would use the rel=canonical tag on each page to specify the original page (/product/green-ranger/).
Then, within your sitemap, you should only include the canonical (original) URL (not the other versions).
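If you already know which canonical each page declares, filtering your sitemap list is a one-liner. This is a minimal sketch assuming you’ve collected that mapping yourself (for example, from a crawl); the helper name `canonical_only` is hypothetical, and the paths are shortened for readability (a real sitemap needs absolute URLs).

```python
def canonical_only(pages):
    """Keep only URLs whose rel=canonical points back at themselves.

    `pages` maps each URL to the canonical URL declared on that page.
    """
    return [url for url, canonical in pages.items() if url == canonical]


pages = {
    "/action-figures/green-ranger/": "/product/green-ranger/",
    "/power-rangers/green-ranger/": "/product/green-ranger/",
    "/90s/green-ranger/": "/product/green-ranger/",
    "/product/green-ranger/": "/product/green-ranger/",
}
print(canonical_only(pages))
```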
You can place your sitemap anywhere on your website, but it can only reference descendant URLs. For example, if we put our sitemap on www.klientboost.com/projects/sitemap.xml, it could only reference URLs within the /projects subfolder. This is why we recommend putting your sitemap on your root (e.g. klientboost.com/sitemap.xml).
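To make that rule concrete, here’s a small Python sketch that checks whether a page falls inside a sitemap’s folder. The function name `in_sitemap_scope` is our own, and the /projects/ page URL is a made-up example.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url sits at or below the sitemap's folder."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    # Scheme and host must match exactly (www vs. non-www counts as different).
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False
    # "/projects/sitemap.xml" -> "/projects/"; "/sitemap.xml" -> "/"
    base_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    return (page.path or "/").startswith(base_dir)


nested = "https://www.klientboost.com/projects/sitemap.xml"
root = "https://www.klientboost.com/sitemap.xml"
print(in_sitemap_scope(nested, "https://www.klientboost.com/projects/example/"))
print(in_sitemap_scope(nested, "https://www.klientboost.com/seo-agency"))
print(in_sitemap_scope(root, "https://www.klientboost.com/seo-agency"))
```

A sitemap at the root has a base folder of “/”, which is why it can reference every page on the same host.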
Don’t know how to find your sitemap? The easiest way is to check your robots.txt file first.
Type in your URL followed by /robots.txt. For example, yourwebsite.com/robots.txt.
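Many sites list their sitemap location in robots.txt with a `Sitemap:` line. If you’d rather not eyeball the file, the sketch below pulls those lines out using Python’s built-in robots.txt parser (the `site_maps()` method needs Python 3.8 or newer). The wrapper name `find_sitemaps` and the sample file contents are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def find_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt file's text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: lines are present.
    return parser.site_maps() or []


# Hypothetical robots.txt contents.
sample = """\
User-agent: *
Disallow: /admin/
Sitemap: https://klientboost.com/sitemap.xml
"""
print(find_sitemaps(sample))
```

An empty list means the file doesn’t declare a sitemap, in which case trying /sitemap.xml directly is a reasonable next guess.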
No use making search engines work to find your sitemap. That would be ironic, considering the entire point of your sitemap is to make your website more discoverable.
Instead, submit your sitemap directly to search engines.
The easiest way to submit your sitemap is to do it through Google’s Search Console sitemap report.
Log in to Search Console, navigate to the sitemap report, and submit your sitemap URL.
Note: Bing Webmaster Tools offers the same easy option.
Google only checks your sitemap when it’s submitted; it doesn’t check your sitemap every time it crawls your site. That’s why it only makes sense to update your sitemap and ping Google when you have real updates, rather than resubmitting an unchanged sitemap.
Just like a hiker shouldn’t leave home without a map, a website shouldn’t go live without its sitemap.
You wouldn’t want web crawlers to get lost, would you?
Big. Small. Somewhere in between. It doesn’t matter: play it safe and create an SEO sitemap. Just don’t forget to submit it to Google 😉.
Ok, one last map joke?
What do Clint Eastwood and a map key have in common?
...
Both are legends.
How funny is that?