What Is a Sitemap and How Do Sitemaps Work? 

This article focuses on what a sitemap is and how sitemaps work.

We will also walk through how to create a sitemap for your website.

    What Is a Sitemap?

    An XML sitemap is a file that tells search engines what pages, videos, and other files are on your website.

    It also signals how important each page or file is relative to the others, along with other useful details such as when it was last updated.

    With a sitemap in place, search engines can crawl your website more thoroughly and index it more efficiently, so that people can find it more easily.

    This means that sitemaps can support good SEO.

    Now that we know what a sitemap is, next we will discuss how sitemaps work.

    How Do Sitemaps Work?

    Sitemaps contain a list of URLs for a website, with extra metadata about each URL.

    This metadata usually covers when the URL was last updated, how often it changes, and its priority compared to other pages on the site.

    Search engine crawlers find pages by following links within a site (internal links) and links from other sites (backlinks). Crawlers can also read the URLs listed in a sitemap and learn more about them from the metadata.

    However, using a sitemap does not guarantee that your pages will be added to the search index.
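
    To make this more concrete, here is a minimal Python sketch of how a program could read the URLs and metadata from a sitemap. It is only an illustration of the idea, not how any particular search engine works. The sample data and the function name read_sitemap are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample data, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2017-10-06</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.9</priority>
  </url>
  <url>
    <loc>https://www.example.com/about</loc>
  </url>
</urlset>"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry; missing optional tags become None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            tag: url.findtext(f"sm:{tag}", default=None, namespaces=NS)
            for tag in ("loc", "lastmod", "changefreq", "priority")
        })
    return entries


for entry in read_sitemap(SAMPLE):
    print(entry["loc"], entry["lastmod"], entry["changefreq"], entry["priority"])
```

    Only the URL itself is required for each entry. The other values are optional, which is why the second entry in the sample has no metadata at all.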

    Sitemap Format

    Your XML sitemap file should be UTF-8 encoded.

    Here is an example of a sitemap containing one URL:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:xhtml="http://www.w3.org/1999/xhtml">
      <url>
        <loc>https://www.example.com</loc>
        <lastmod>2017-10-06</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.9</priority>
        <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com"/>
        <xhtml:link rel="alternate" hreflang="fr" href="https://www.example.com/fr"/>
      </url>
    </urlset>
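
    If you would rather generate this file than type it by hand, the following Python sketch builds a sitemap of the same shape using the standard library. The function name build_sitemap and the layout of the page dictionaries are assumptions made for this example, not part of the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Register prefixes so the output uses a default namespace and "xhtml:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_sitemap(pages):
    """Build a <urlset> from a list of page dicts and return UTF-8 bytes."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        # Optional metadata tags are written only when present.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(page[tag])
        # Optional language alternates, written as xhtml:link elements.
        for lang, href in page.get("alternates", {}).items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": lang, "href": href},
            )
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


example_pages = [{
    "loc": "https://www.example.com",
    "lastmod": "2017-10-06",
    "changefreq": "weekly",
    "priority": 0.9,
    "alternates": {
        "en": "https://www.example.com",
        "fr": "https://www.example.com/fr",
    },
}]
print(build_sitemap(example_pages).decode("utf-8"))
```

    Building the file with an XML library also takes care of entity escaping for you, because special characters in the values are escaped when the file is written out.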

    XML Tag Definitions

    <urlset> wraps the whole file and references the current protocol standard.

    <url> is the parent tag for each URL entry. All the remaining tags are children of this tag.

    <loc> is the URL of the page. It must start with the protocol (such as http or https) and end with a forward slash (/) if your web server requires one. It must be less than 2,048 characters long.

    <lastmod> is the date when the page was last modified. It must be in W3C Datetime format, which lets you leave out the time and use just YYYY-MM-DD.

    <changefreq> is how often the page is likely to change. Valid values are: always, hourly, daily, weekly, monthly, yearly, never.

    These values are hints, not commands, so pages might be crawled more or less frequently than the value you set.

    ‘Always’ should be used for pages that change each time they are accessed. ‘Never’ should be used for archived URLs.

    <priority> is the priority of the URL compared to other URLs on your site.

    Valid values range from 0.0 to 1.0, and the default value is 0.5. The value lets crawlers know which pages you think are most important.

    Assigning a priority does not change the ranking of that page in the search engine result pages (SERPs). Search engines may use it when choosing between URLs on the same site, so you can use this tag to increase the likelihood that your most important pages appear in the search index.
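
    The rules for each tag can be checked before you publish a sitemap. The Python sketch below is one simple way to do that. The function name validate_entry is hypothetical, and the lastmod check is a simplification: it only accepts the plain YYYY-MM-DD form, although W3C Datetime also allows values that include a time.

```python
import re
from datetime import date

VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def validate_entry(entry):
    """Return a list of problems found in one sitemap entry (empty if none)."""
    problems = []

    loc = entry.get("loc", "")
    if not loc.startswith(("http://", "https://")):
        problems.append("loc must start with http:// or https://")
    if len(loc) >= 2048:
        problems.append("loc must be less than 2,048 characters")

    lastmod = entry.get("lastmod")
    if lastmod is not None:
        # Simplification: accept only the date-only form YYYY-MM-DD.
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", lastmod):
            problems.append("lastmod must use the YYYY-MM-DD format")
        else:
            try:
                date.fromisoformat(lastmod)
            except ValueError:
                problems.append("lastmod is not a real calendar date")

    changefreq = entry.get("changefreq")
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append("changefreq is not one of the valid values")

    priority = entry.get("priority")
    if priority is not None:
        try:
            value = float(priority)
        except (TypeError, ValueError):
            value = None
        if value is None or not 0.0 <= value <= 1.0:
            problems.append("priority must be a number from 0.0 to 1.0")

    return problems


print(validate_entry({
    "loc": "https://www.example.com/",
    "lastmod": "2017-10-06",
    "changefreq": "weekly",
    "priority": "0.9",
}))
print(validate_entry({"loc": "www.example.com", "priority": "1.5"}))
```

    An empty list means that no problems were found in the entry.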

    Now that you know what a sitemap is and how sitemaps work, next we will discuss when you should use a sitemap.

    When To Use a Sitemap

    Sitemaps are especially useful for improving the crawling process when:

    • Your website is large, with many pages, or new, with few links pointing to it.
    • Your website is updated frequently and you want search engines to know about the new content right away.
    • You have pages that are not easily found by search engine crawlers, such as pages that few or no other pages link to.
    • Your website contains lots of videos and images. Keep in mind that a single sitemap file is limited to 50MB uncompressed.
    • Your website is shown in Google News.

    However, you might not need a sitemap if your website:

    • Is small. You will need a sitemap once your website starts to grow.
    • Is extensively internally linked, meaning that Google can find all important pages by following links from the homepage.
    • Does not have lots of media files or news pages that you want visible on the search engine results pages (SERPs).

    How To Create a Great Sitemap

    Content Management Systems (CMS) like WordPress provide sitemaps by default or as an option, through plugins like Yoast SEO.

    This means you do not have to make a sitemap from scratch, because the CMS or plugin creates one for you automatically.

    If your CMS does not do this, creating a sitemap is still easy.

    You can write one manually, but that takes a very long time for anything beyond a handful of pages.

    It is best to use a tool like XML Sitemaps Generator to create sitemaps for Google, Bing and other search engines.

    Once your sitemap is created, you need to submit it to search engines like Google and Bing. You can do this through their webmaster tools.

    Follow these steps when creating your sitemap:

    1. Choose which pages should be crawled and the canonical version of each page. Do not include URLs that are blocked in your robots.txt file, that need a login, or that are password protected.
    2. Work out whether you need more than one sitemap. You will need another sitemap if your current sitemap has over 50,000 URLs or is larger than 50MB.
    3. Follow the sitemap format described above, using the correct XML tags.
    4. If you have more than one sitemap, create a sitemap index file that lists the URL of each sitemap.
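
    Steps 2 and 4 can be sketched in Python as follows. This is a rough outline under a few assumptions: the file names (sitemap1.xml, sitemap_index.xml) and function names are made up for this example, only the 50,000 URL limit is checked and not the file size, and each entry contains only a <loc> tag to keep the example short.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split the URL list into groups of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def urlset_xml(urls):
    """Return the text of one sitemap file listing the given URLs."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<urlset xmlns="{SITEMAP_NS}">']
    for url in urls:
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


def sitemap_index_xml(sitemap_urls):
    """Return the text of a sitemap index file pointing at each sitemap."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<sitemapindex xmlns="{SITEMAP_NS}">']
    for url in sitemap_urls:
        lines.append(f"  <sitemap><loc>{escape(url)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


def build_sitemaps(urls, base_url, size=MAX_URLS_PER_SITEMAP):
    """Return {file name: XML text}; an index is added only when needed."""
    chunks = chunk_urls(urls, size)
    if len(chunks) <= 1:
        return {"sitemap.xml": urlset_xml(urls)}
    files = {}
    for number, chunk in enumerate(chunks, start=1):
        files[f"sitemap{number}.xml"] = urlset_xml(chunk)
    index_entries = [f"{base_url}/{name}" for name in files]
    files["sitemap_index.xml"] = sitemap_index_xml(index_entries)
    return files


# Small demonstration: five URLs with a limit of two per file.
demo_urls = [f"https://www.example.com/page{i}" for i in range(5)]
for name in build_sitemaps(demo_urls, "https://www.example.com", size=2):
    print(name)
```

    With the real limit of 50,000 URLs, a small site produces a single sitemap.xml and no index file at all.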

    Entity Escaping and URL Encoding

    All data values in an XML sitemap file, including URLs, must be entity-escaped.

    Here are the characters that need escaping, along with their corresponding entity escape codes:

    • Ampersand (&) – &amp;
    • Single Quote (') – &apos;
    • Double Quote (") – &quot;
    • Greater Than (>) – &gt;
    • Less Than (<) – &lt;
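
    As an illustration, the five characters above can be escaped in Python with the standard library. The function name escape_for_sitemap is made up for this example.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are added explicitly.
QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_for_sitemap(value):
    """Entity-escape a data value before placing it in a sitemap file."""
    return escape(value, QUOTE_ENTITIES)


print(escape_for_sitemap("https://www.example.com/view?widget=3&price=10"))
```

    Here the ampersand in the URL becomes &amp; while the rest of the URL is left unchanged.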

    All URLs, including the sitemap URL, must also be URL-escaped and encoded so that the web server hosting the site can read them.

    This is because all URLs need to follow the syntax defined in the Uniform Resource Identifier (URI) specification.

    The only characters you can use in URLs are:

    • Letters and numbers
    • Unreserved characters: - _ . ~
    • Reserved characters, which have a special meaning in a URL: ! * ' ( ) ; : @ & = + $ , / ? % # [ ]

    Here is a list of characters that must be encoded when they appear as plain data in a URL, along with the encoded value that you should use instead:

    • Space – %20
    • " – %22
    • < – %3C
    • > – %3E
    • # – %23
    • % – %25
    • | – %7C

    To make things easier, you can use a URL encoding tool.
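
    If you prefer to do this in code, Python's standard library can do the encoding. The sketch below is a minimal example, and the function name encode_path is made up for this article. It assumes the path you pass in is raw, meaning it has not been encoded yet. Encoding the same path twice would turn every % into %25.

```python
from urllib.parse import quote


def encode_path(path):
    """Percent-encode a raw URL path, keeping the '/' separators as they are."""
    return quote(path, safe="/")


print(encode_path("/my page|50%.html"))

# Check the characters from the list above one by one.
for character in [" ", '"', "<", ">", "#", "%", "|"]:
    print(repr(character), quote(character, safe=""))
```

    Remember that URL encoding and entity escaping are separate steps. A URL that goes into a sitemap file should be URL-encoded first and entity-escaped afterwards.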

    How to Submit Your Sitemap on Google

    1. Log in to your Google Search Console account.
    2. Click “Sitemaps”.
    3. Type your sitemap URL. If you have a sitemap index file, only type in the URL of your sitemap index file.
    4. Click “Submit”.

    How to Submit Your Sitemap on Bing

    1. Log in to your Bing Webmaster Tools account.
    2. Click “Sitemaps”, then “Submit Sitemap”.
    3. Type your sitemap URL. If you have a sitemap index file, only type in the URL of your sitemap index file.
    4. Click “Submit”.

    Where To Find Your Sitemap

    You might already have a sitemap without knowing it.

    Checking whether you have one before making one can save you a lot of time and effort.

    If your site is hosted on WordPress.com, type in yoursitename.wordpress.com/sitemap.xml and replace ‘yoursitename’ with your site name.

    For custom domain websites, type in yourdomainname.com/sitemap.xml and replace ‘yourdomainname’ with your domain name.

    A custom domain name is basically your own website address.

    For example, a custom domain would be ‘yoursitename.com’, not ‘yoursitename.wordpress.com’.

    If you still cannot find your sitemap, try replacing /sitemap.xml with one of these other options:

    • /sitemap-index.xml
    • /sitemap.php
    • /sitemap.txt
    • /sitemap.xml.gz
    • /sitemap/
    • /sitemap/sitemap.xml
    • /sitemapindex.xml
    • /sitemap/index.xml
    • /sitemap1.xml
    • /rss/
    • /rss.xml
    • /atom.xml
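
    Trying each of these locations by hand gets tedious, so here is a small Python sketch that checks them for you. It makes several assumptions: the site is served over https, a location counts as found only when the server answers a HEAD request with status 200, and the function names are made up for this example. Some servers do not answer HEAD requests properly, so treat a negative result as a hint rather than proof.

```python
from urllib.request import Request, urlopen

# The usual location first, followed by the alternatives listed above.
CANDIDATE_PATHS = [
    "/sitemap.xml", "/sitemap-index.xml", "/sitemap.php", "/sitemap.txt",
    "/sitemap.xml.gz", "/sitemap/", "/sitemap/sitemap.xml",
    "/sitemapindex.xml", "/sitemap/index.xml", "/sitemap1.xml",
    "/rss/", "/rss.xml", "/atom.xml",
]


def candidate_urls(domain):
    """Build the full URL of every candidate location for a domain."""
    base = f"https://{domain.strip('/')}"
    return [base + path for path in CANDIDATE_PATHS]


def url_exists(url, timeout=10):
    """Return True if a HEAD request to the URL answers with status 200."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Covers HTTP errors, connection problems and malformed URLs.
        return False


def find_sitemaps(domain, exists=url_exists):
    """Return the candidate URLs for which the `exists` check succeeds."""
    return [url for url in candidate_urls(domain) if exists(url)]


# Listing the candidates does not need a network connection.
for url in candidate_urls("yourdomainname.com"):
    print(url)
```

    To run the actual check, call find_sitemaps with your own domain, for example find_sitemaps("yourdomainname.com"). This needs a working internet connection.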

    Conclusion: Sitemaps Are Good For SEO

    We have covered what a sitemap is and how sitemaps work.

    Sitemaps can help with SEO by helping search engines find all the pages on your website, so that they can be indexed and shown in search engine result pages.

    They are especially useful if you have a large website or a new website with few links to it.

    If you update your website frequently, you can also use sitemaps to let search engines know about the new content right away.

    Lastly, sitemaps can be helpful for pages that are not easily found by search engine crawlers.

    Our team at Lead Genera can help with SEO or sitemaps. Just chat with us.