Sitemaps

4SEO can create an XML sitemap, keep it up to date and submit it to search engines, all entirely automatically. You can also manually add or remove pages from the sitemap to your liking.

Sitemaps are used by search engines for two main reasons:

  • to discover pages that are not naturally linked from other parts of your site
  • to help in deciding which page is the canonical one in case the same content can be accessed through multiple URLs

For larger sites, they can also help search engines decide which pages will or will not be crawled, and how quickly, by giving priority to canonical pages.

As 4SEO crawls your entire site, it can easily build an exhaustive list of its pages. But simply putting all of those pages in the sitemap would be a poor result from an SEO standpoint.

So 4SEO pays close attention to including only the right pages in your XML sitemap, for the best SEO benefit.

Sitemaps should nearly never contain all pages on a site

Only important pages, with SEO value, should be listed. Likewise, if some pages can be accessed through multiple URLs, only the single canonical URL should be listed. And of course, pages with a noindex meta tag, or pages excluded through a robots.txt directive, should never be included.

In the same way, a sitemap made up only of links from your menus is useless: those links are all discovered as soon as the home page is loaded and add no information about canonicalization.
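
For illustration, a well-built sitemap lists one entry per canonical URL, in the standard sitemap XML format. The domain, URL and date below are placeholders only; the file 4SEO actually generates may include additional optional tags:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/important-page</loc>
        <lastmod>2024-05-01</lastmod>
      </url>
    </urlset>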

Sitemaps generation

4SEO will generate a sitemap automatically once a site analysis has completed. This cannot happen earlier, as 4SEO must crawl the entire site to identify canonical URLs and also perform checks such as the following (illustrated below):

  • presence of noindex meta tag
  • exclusion of a page with a robots.txt
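
For reference, these are the kinds of markers 4SEO looks for; the path below is a placeholder. A page that carries this meta tag in its head, or that is matched by such a robots.txt rule, is kept out of the sitemap:

    <meta name="robots" content="noindex">

    User-agent: *
    Disallow: /private-section/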

Once a sitemap is ready, it is automatically published. 4SEO then constantly monitors the site for changes and for new content.

When such a change or addition is detected, the new or modified page is analyzed again. New pages are added to the sitemap file if they comply with all requirements. Modified pages are checked again and may be added to the sitemap (if they were previously excluded) or removed from it (if they no longer meet the requirements).
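
Conceptually, the decision made for each new or re-analyzed page boils down to a rule like the sketch below. This is an illustration only, written in Python with made-up names, not 4SEO's actual implementation:

    # Illustrative sketch only, not 4SEO's actual code: the class and
    # attribute names are made up for this example.
    from dataclasses import dataclass

    @dataclass
    class Page:
        url: str
        canonical_url: str
        has_noindex_meta: bool
        disallowed_by_robots: bool

    def belongs_in_sitemap(page: Page) -> bool:
        # Only the canonical URL of a piece of content is listed.
        if page.url != page.canonical_url:
            return False
        # Pages that search engines are told to skip are never listed.
        if page.has_noindex_meta or page.disallowed_by_robots:
            return False
        return True

    # A duplicate URL serving the same content as its canonical is excluded.
    duplicate = Page(
        url="https://www.example.com/blog?start=0",
        canonical_url="https://www.example.com/blog",
        has_noindex_meta=False,
        disallowed_by_robots=False,
    )
    print(belongs_in_sitemap(duplicate))  # prints False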

The Sitemaps page has a dashboard that tells you all about the sitemap status:

View of a sample 4SEO sitemap dashboard

Smaller sites, with fewer than 500 pages, generally do not need a sitemap, at least per Google's documentation. They are small enough that search engines will crawl them entirely anyway.

However, even in that case an SEO-oriented sitemap can be useful if it properly indicates canonical and important pages only.

Manually adding or removing pages from sitemaps

While the sitemap creation process is entirely automatic, including selecting canonical URLs and checking robots.txt exclusions, you can manually remove or add back individual pages as you see fit.

Go to the Pages page and use the Sitemap column to exclude/include pages:

View of a sample 4SEO pages list with sitemap selector highlighted

Sitemaps publication

Once a sitemap has been created, or each time it's updated (pages added or removed), 4SEO will publish it.

There are two ways your site can advertise the presence of a sitemap file: through your robots.txt file and by submitting it directly to search engines.

4SEO does both, unless you tell it otherwise using the Settings toolbar button of the Sitemaps page.

Add to robots.txt

Most sites have a robots.txt file that tells search engines which parts of the website they may explore and index, and which parts they should stay away from.

The standard way to tell search engines that your website does have a sitemap is by including a specific line in your robots.txt.
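
That line is a Sitemap directive pointing at the sitemap's full URL. The domain and file name below are placeholders; 4SEO uses your site's actual sitemap URL:

    Sitemap: https://www.example.com/sitemap.xml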

If Add to robots.txt file is set to Yes in the sitemaps settings, 4SEO will automatically insert a line into your robots.txt file to achieve that.

If you set Add to robots.txt file to No, however, 4SEO will remove that line from your robots.txt.

Tip

If you told 4SEO that your site is NOT public during the initial configuration step right after installation, Add to robots.txt file is automatically set to No.

Submit to search engines

If your sitemap file is only listed in your robots.txt file, you still depend on search engines coming across your site to discover it. A faster and more efficient way to let them know about your sitemap is to send it to them directly.

Both Google and Bing accept direct submission of sitemaps. Unless you configure 4SEO otherwise, it will automatically submit any new or updated sitemap to these search engines.
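
Such a submission is essentially a single HTTP GET request that hands the sitemap URL to the search engine's ping endpoint. The request below is an illustration only: the endpoint host and path are placeholders, as each search engine publishes its own, and example.com stands in for your domain:

    GET https://search-engine.example/ping?sitemap=https://www.example.com/sitemap.xml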

You can see if and when each search engine came and loaded the sitemap content on the Sitemaps dashboard.

Tip

If you told 4SEO that your site is NOT public during the initial configuration step right after installation, the Ping search engines list is automatically emptied so that no search engine is sent the sitemap.