Settings

This page describes settings for Sitemaps generation. You can access these settings by clicking the Settings button in the toolbar of the Sitemaps page of 4SEO admin.

The settings dialog has 3 tabs which are described below.

Configuration

This is the global configuration for sitemaps generation by 4SEO. Here you can globally enable or disable sitemaps generation, and choose how the sitemap is published to search engines.

Sitemaps handling configuration

Decide whether and how 4SEO should handle sitemaps. Or not.

Enable sitemaps handling

This is a global on/off switch for anything related to sitemaps within 4SEO. If you set this setting to No:

  • any reference to 4SEO sitemap in your robots.txt is removed
  • any request to the sitemap files will result in a 404

Re-enabling the option will cause the sitemap build process to restart.

Minimal delay between consecutive sitemap updates

4SEO builds the first version of your sitemap as soon as the first round of website analysis has completed. After that, analysis continues permanently in the background and 4SEO will detect changes in content such as pages being added, modified or deleted.

These changes may require an update of the sitemap, for instance if a page was added or removed.

4SEO performs this update automatically, but it is good practice to avoid updating the sitemap too often, at least on larger websites:

  • while being rebuilt, the sitemap may not be available (a Service unavailable response is shown if you try to load it). Sitemap unavailability should be kept to a minimum, if it ever happens
  • on large websites (from a few thousand pages), rebuilding the sitemap takes time and resources. Your server may handle that fine, or it may be short on resources
  • search engines will not like being pinged about a sitemap update every 5 minutes. After a while, they may stop coming to load the new sitemap

So if you have a large website, and you experience sitemap loading issues in Google Search Console for instance, you should try and increase the delay between updates, especially if you don't put out valuable content several times a day.
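The idea of a minimal delay can be sketched as a simple check. This is only an illustration of the principle, not 4SEO's actual code: the function name, its parameters and the one-hour delay are all hypothetical.

```python
from datetime import datetime, timedelta


def rebuild_due(last_build: datetime, now: datetime,
                min_delay: timedelta, has_pending_changes: bool) -> bool:
    """Return True when a sitemap rebuild should be started.

    Hypothetical helper: a rebuild happens only if content changed AND
    at least `min_delay` has passed since the previous build.
    """
    if not has_pending_changes:
        return False
    return now - last_build >= min_delay


# Illustrative values only.
last = datetime(2024, 1, 1, 12, 0)
delay = timedelta(hours=1)

too_soon = rebuild_due(last, datetime(2024, 1, 1, 12, 5), delay, True)
late_enough = rebuild_due(last, datetime(2024, 1, 1, 13, 30), delay, True)
print(too_soon, late_enough)
```

With a longer delay, several content changes are grouped into a single rebuild, which limits both server load and the number of update notifications sent to search engines.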

XML sitemaps publication

Add to robots.txt

Most sites have a robots.txt file that tells search engines which parts of a website it's OK to explore and index, and which parts they should stay away from.

The standard way to tell search engines that your website does have a sitemap is by including a specific line in your robots.txt.

If Add to robots.txt file is set to Yes in the sitemaps settings, 4SEO will automatically insert a line into your robots.txt file to achieve that.

However, whenever you set Add to robots.txt file to No, 4SEO will remove that line from robots.txt.

If you told 4SEO that your site is NOT public during the initial configuration step right after installation, Add to robots.txt file is automatically set to No.
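For reference, the standard robots.txt line that announces a sitemap has the form "Sitemap: " followed by the absolute URL of the sitemap. The sketch below shows how such a line could be added or removed. It is a minimal illustration, not 4SEO's implementation; the function name, the robots.txt content and the sitemap URL are assumptions.

```python
def set_sitemap_line(robots_txt: str, sitemap_url: str, enabled: bool) -> str:
    """Return robots.txt content with the Sitemap line added or removed."""
    directive = f"Sitemap: {sitemap_url}"
    # Drop any existing copy of this directive (compared case-insensitively)
    # so that enabling twice never produces a duplicate line.
    kept = [
        line for line in robots_txt.splitlines()
        if line.strip().lower() != directive.lower()
    ]
    if enabled:
        kept.append(directive)
    return "\n".join(kept) + "\n"


# Hypothetical robots.txt content and sitemap URL, for illustration only.
robots = "User-agent: *\nDisallow: /administrator/\n"
url = "https://www.example.com/sitemap.xml"

with_sitemap = set_sitemap_line(robots, url, enabled=True)
without_sitemap = set_sitemap_line(with_sitemap, url, enabled=False)
print(with_sitemap)
```

Removing the line again returns the file to its original content, which mirrors switching the setting from Yes back to No.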

Submit to search engines

If your sitemap file is listed in a robots.txt file, you are still dependent on search engines to come across your site to discover the sitemap. A faster and more efficient way to let search engines know about your site is to directly send them the sitemap.

As Google accepts direct submission of sitemaps, and unless you configure 4SEO otherwise, it will automatically submit any new or updated sitemap to Google.

You can see if and when a search engine came and loaded the content of the sitemap on the Sitemap dashboard.

If you told 4SEO that your site is NOT public during the initial configuration step right after installation, the Ping search engines list is automatically emptied so that no search engine is sent any sitemap.

Bing stopped accepting sitemap submissions in mid-December 2021, so we removed Bing submission. It turned out that submitting to Bing was still possible through an undocumented endpoint. We used that alternate method for a few months, but on May 13 Bing permanently stopped accepting sitemap submissions and that feature was removed from 4SEO.

Included content

The following settings let you exclude one or many pages from your sitemaps, with simple rules.

If you change any of the settings below, you will need to perform a full site analysis again so that 4SEO can select anew the content you want included or excluded. Do so by using the Reset analysis button on the Site analysis tab of the Pages page settings.

Exclude archived content

By default, 4SEO will exclude archived content from your sitemap, as spending search engines' crawl budget on out-of-date content is a waste.

You can change this behavior here globally for the site, if you want to suggest to search engines that they crawl your archived content.

This setting acts globally. You can also add one or more Sitemap rules.

An archived status exists for Joomla articles or categories but may not exist for other types of content.

Excluded URLs

Use our simple URL specification syntax to specify which website pages or groups of website pages you want to exclude from the sitemap. For instance:

/legal/{*}

will exclude all pages with URLs starting with /legal/ such as /legal/privacy or /legal/terms.
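To make the matching more concrete, here is a minimal sketch of how a rule such as /legal/{*} could be evaluated. It assumes that {*} stands for any run of characters (including none) and {?} for exactly one character; these semantics and the function names are assumptions made for the example, not a description of 4SEO's internal matcher.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate an exclusion rule into a compiled regular expression.

    Assumed semantics: {*} = any run of characters, {?} = one character.
    Everything else is matched literally.
    """
    parts = re.split(r"(\{\*\}|\{\?\})", rule)
    pattern = []
    for part in parts:
        if part == "{*}":
            pattern.append(".*")
        elif part == "{?}":
            pattern.append(".")
        else:
            pattern.append(re.escape(part))
    return re.compile("".join(pattern))


def is_excluded(url_path: str, rules: list[str]) -> bool:
    """Return True if the whole URL path matches at least one rule."""
    return any(rule_to_regex(rule).fullmatch(url_path) for rule in rules)


rules = ["/legal/{*}"]
print(is_excluded("/legal/privacy", rules))     # True
print(is_excluded("/blog/legal/terms", rules))  # False
```

Because the whole URL must match, a rule without a leading wildcard only applies to URLs that start with the given prefix.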

Be sure to exclude all pages that do not bring value to your site content, and for which you do not want to rank in search engine results. Any other page such as contact information,

Images

These settings let you include images detected in your content in your sitemap and configure how 4SEO should select them.

4SEO will only search for images within pages that are public and not restricted by robots.txt directives or noindex meta tags.

Include images

Image dimensions

Whether for regular pages or images, it is usually best to only include relevant and useful images in your image sitemap, avoiding any purely decorative or functional images such as stock images or button icons.

By default, 4SEO will only include images that are wider than 200 pixels. Under the Images tab of the Sitemaps settings, you'll find 2 input fields to set, or remove, a minimal width and/or height for any image to be included.
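The effect of the two fields can be pictured with the following sketch. It is a hypothetical helper, not 4SEO code: the function and parameter names are invented, an empty field is represented by None, and treating an image of exactly the minimal size as too small follows the "wider than" wording above but is an assumption.

```python
from typing import Optional


def passes_size_filter(width: int, height: int,
                       min_width: Optional[int] = 200,
                       min_height: Optional[int] = None) -> bool:
    """Return True if an image is large enough to be listed in the sitemap.

    A limit set to None means that dimension is not checked.
    """
    if min_width is not None and width <= min_width:
        return False
    if min_height is not None and height <= min_height:
        return False
    return True


# Default behavior: only the width is checked, against 200 pixels.
print(passes_size_filter(640, 480))   # True
print(passes_size_filter(150, 150))   # False
# Width limit removed: the small image is now accepted.
print(passes_size_filter(150, 150, min_width=None))  # True
```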

Exclude categories

On many sites, category pages show the first few paragraphs of the articles or items they contain, with a Read more or similar link to visit the full article.

Often, this summary will also include one or more images, and it means the same images can be found both in the article they belong to but also on the containing category pages.

This may negatively affect the article ranking for this image. As we usually want the article/item to rank rather than category pages, 4SEO will not include images found on category pages, by default.

You can change this behavior with the corresponding setting under the Images tab of the Sitemaps settings.

You'll need to Reset analysis for this setting to take effect right away.

Excluded URLs

Use our simple URL specification syntax to specify which images or groups of images you want to exclude from the sitemap. For instance:

/images/icons/{*}

will exclude all images with URLs starting with /images/icons/ such as /images/icons/open.svg or /images/icons/close.svg.

https://someothersite.com/{*}

will exclude all images found in your pages but coming from another website.

Excluded domains

This option lets you exclude from the sitemaps all images whose URLs are on the specified domains.

Enter one domain per line, exactly as they are listed in your content. Do not use {*} or {?} wildcard characters. Here is an example configuration:

assets.weeblr.net
icons.some-provider.com
unsplash.com

Images hosted on these domains will not be included in the sitemap, regardless of whether they use http:// or https://.
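A minimal sketch of this kind of check is shown below, using the example domains above. It compares only the host name of each image URL, so the protocol does not matter. The function name is hypothetical, and matching the host exactly (so that a subdomain must be listed separately) is an assumption based on the "exactly as they are listed" wording.

```python
from urllib.parse import urlparse

# Example configuration from the list above.
EXCLUDED_DOMAINS = {"assets.weeblr.net", "icons.some-provider.com", "unsplash.com"}


def is_on_excluded_domain(image_url: str, excluded: set[str]) -> bool:
    """Return True if the image URL's host is one of the excluded domains."""
    host = urlparse(image_url).hostname  # None for relative URLs
    return host is not None and host in excluded


print(is_on_excluded_domain("http://unsplash.com/photo.jpg", EXCLUDED_DOMAINS))
print(is_on_excluded_domain("https://unsplash.com/photo.jpg", EXCLUDED_DOMAINS))
print(is_on_excluded_domain("/images/local/photo.jpg", EXCLUDED_DOMAINS))
```

The first two calls give the same result because only the host part of the URL is compared; the third URL is relative, has no host, and is therefore never excluded by this setting.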