TL;DR
- Create a sitemap.xml at layouts/_default/sitemap.xml
- Insert the skeleton and fill it with the content you want to be discovered, using Hugo template syntax.
{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
YOUR CONTENT GOES HERE
</urlset>
- Include the sitemap in the robots.txt
User-agent: *
Allow: /
Sitemap: https://miriam-mueller.com/sitemap.xml
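If Hugo should generate the robots.txt itself, the same content can be written as a template. This is a minimal sketch, assuming that `enableRobotsTXT = true` is set in the site configuration and that the template is saved as layouts/robots.txt. `absURL` builds the sitemap address from the site's baseURL, so the domain does not have to be hardcoded:

```go-html-template
User-agent: *
Allow: /
Sitemap: {{ "sitemap.xml" | absURL }}
```

With a baseURL of https://miriam-mueller.com/ this should render the same three lines as the static file above.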
Sitemap?!
A sitemap is a file that helps search engines discover the pages of your site. You do not strictly need one, because search engine bots crawl your site by following its links. A sitemap can still aid discovery, especially if your site is new and few other pages link to it.
Key Knowledge
- A sitemap only helps search engines discover your site
- It does not control what is indexed (or not indexed)
- When is a sitemap helpful?
  - large sites
  - isolated pages with few links to each other
  - new sites
  - quick and frequent content changes
- Pages excluded from indexing (via robots.txt or the robots meta tag) should not appear in the sitemap
- Most Hugo themes generate the sitemap.xml automatically. Check whether it matches your desired result
Custom sitemap.xml full example
{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{/* Loop over all pages, including automatically generated ones */}}
  {{ range .Data.Pages }}
    {{/* Skip every page whose URL matches one of the exclusions below */}}
    {{ if not (or
      (strings.Contains .RelPermalink "/tags/")
      (strings.Contains .RelPermalink "/year/")
      (strings.Contains .RelPermalink "/series/")
      (strings.Contains .RelPermalink "/search/")
      (strings.Contains .RelPermalink "/contact/")
      (strings.Contains .RelPermalink "/archives/")
      (strings.Contains .RelPermalink "/about/")
      (strings.Contains .RelPermalink "/projects/")
      (strings.Contains .RelPermalink "/privacy-policy/")
      (eq .Permalink "https://miriam-mueller.com/post/")
      (eq .Permalink "https://miriam-mueller.com/page/")
      )
    }}
    <url>
      <loc>{{ .Permalink }}</loc>
      {{/* Only emit lastmod when a modification date is known */}}
      {{ if not .Lastmod.IsZero }}
      <lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>
      {{ end }}
    </url>
    {{ end }}
  {{ end }}
</urlset>
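The long `or` chain can also be written as a list of excluded path fragments, which may be easier to extend. The following is only a sketch of that idea, not the template used on this blog. The variable names `$excluded`, `$page`, `$fragment` and `$skip` are made up for illustration, and it assumes a Hugo version that allows reassigning a template variable with `=`. The block would replace the `range` block between the `<urlset>` tags:

```go-html-template
{{/* Path fragments that mark a page as "not an article" (same list as above) */}}
{{ $excluded := slice "/tags/" "/year/" "/series/" "/search/" "/contact/" "/archives/" "/about/" "/projects/" "/privacy-policy/" }}
{{ range $page := .Data.Pages }}
  {{/* $skip becomes true as soon as one fragment occurs in the page URL */}}
  {{ $skip := false }}
  {{ range $fragment := $excluded }}
    {{ if strings.Contains $page.RelPermalink $fragment }}
      {{ $skip = true }}
    {{ end }}
  {{ end }}
  {{/* The two exact-match exclusions stay explicit */}}
  {{ if not (or $skip
      (eq $page.Permalink "https://miriam-mueller.com/post/")
      (eq $page.Permalink "https://miriam-mueller.com/page/")) }}
  <url>
    <loc>{{ $page.Permalink }}</loc>
    {{ if not $page.Lastmod.IsZero }}
    <lastmod>{{ safeHTML ( $page.Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>
    {{ end }}
  </url>
  {{ end }}
{{ end }}
```

Excluding another overview page then only means adding one more string to `$excluded`.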
Custom sitemap.xml explained
For my blog, I decided that only the actual articles should be indexed.
To achieve that, I use the <meta name="robots"> HTML meta tag. Because
Hugo themes generate a sitemap that also includes automatically generated pages such as the archive or tag overviews, I
additionally maintain my own sitemap.xml template that only includes the actual posts.
Because we are in the context of Hugo, we can use Hugo template syntax to iterate over all pages.
- {{ range .Data.Pages }}: this range covers all pages, including the automatically generated ones.
- {{ if not (or ...: skip whatever follows the if when at least one of the conditions inside the or is truthy.
- (strings.Contains .RelPermalink "/xyz"): if the URL of the page currently being looped over contains /xyz, nothing inside the if-block is done for that page.
- (eq .Permalink "https://whatever.you.chose/post/"): if the URL of the page currently being looped over matches exactly, nothing inside the if-block is done for that page either.
- content that is written to sitemap.xml for every page we want to have in there:
<url>
  <loc>{{ .Permalink }}</loc>
  {{ if not .Lastmod.IsZero }}
  <lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>
  {{ end }}
</url>
- resulting sitemap: https://miriam-mueller.com/sitemap.xml
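Instead of excluding everything that is not an article, the condition could also name what should be included. The sketch below is only an illustration and rests on an assumption that has to be checked for each site: that all articles are regular pages in a content section called `post`. `.IsPage` is true for regular content pages only, and `.Section` returns the name of the section a page belongs to:

```go-html-template
{{/* Assumption: articles are regular pages in a content section named "post" */}}
{{ range .Data.Pages }}
  {{ if and .IsPage (eq .Section "post") }}
  <url>
    <loc>{{ .Permalink }}</loc>
    {{ if not .Lastmod.IsZero }}
    <lastmod>{{ safeHTML ( .Lastmod.Format "2006-01-02T15:04:05-07:00" ) }}</lastmod>
    {{ end }}
  </url>
  {{ end }}
{{ end }}
```

The trade-off is that new article sections have to be added to the condition by hand, while the exclusion list above picks them up automatically.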