Adding and removing additional domains, folders, or individual pages is handled using aliases and exclusions. By adding aliases and exclusions, we can control how the Siteimprove crawler evaluates pages that match our entries. This article explains what aliases and exclusions are and how to add them to your site.
Note: When setting up exclusions and aliases only a partial match on the link is needed. A match of "/calendar/" will apply to all links containing "/calendar/".
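The partial-match rule in the note above can be pictured as a plain substring test. The following is a minimal sketch, not Siteimprove's actual implementation; the function name `matches_rule` and the sample URLs are hypothetical, and case-sensitive matching is an assumption.

```python
def matches_rule(url: str, pattern: str) -> bool:
    """Return True when the pattern occurs anywhere in the URL (partial match)."""
    # Assumption: matching is a case-sensitive substring test.
    return pattern in url


# Hypothetical URLs used only for illustration.
urls = [
    "https://www.example.com/calendar/2024/05",
    "https://www.example.com/news/calendar/archive",
    "https://www.example.com/about",
]

# Keep only the URLs that contain the entry "/calendar/".
matched = [u for u in urls if matches_rule(u, "/calendar/")]
print(matched)
```

Under this reading, an entry of "/calendar/" catches the folder wherever it appears in the path, so a short entry can match more URLs than intended.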
Exclusions are commands that are used to tell the Siteimprove crawler to completely ignore a URL as if it never existed. URLs that match an exclusion will not be evaluated or included in your Siteimprove inventory in any way.
Excluded URLs will NOT:
- Be checked for HTTP response code
- Be checked for broken links, misspellings or accessibility issues
- Show in the site inventory
Reasons to exclude a URL:
- The URL (link) is being flagged as a false positive broken link and ignoring the link manually would be too cumbersome
- The resources used to check the URL(s) outweigh the benefits
- The URL (link) is not a priority when fixing issues on your website
Aliases are commands that are used to tell the Siteimprove crawler what is "internal" to your site and what is "external". Aliases are used to identify domains and/or subdomains that are different from your site's main domain name. These domains are then considered part of the main site and are included in all checks that are performed on the site. When setting up an alias, you have the option to indicate whether links that match your alias should be considered internal or external.
An internal page/link is something that we want to check for broken links, misspellings and accessibility issues. It is something you are responsible for and is considered a part of your site.
Note: A link to the aliased domain must exist on the website for our crawler to index it. If the link is not available, contact Technical Support, who can add an 'extra index URL' to achieve the same purpose.
An external page/link is something that you want to make sure exists (is not a broken link), but whose content you do not necessarily want to check.
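The difference between internal, external, and excluded URLs comes down to which checks are run on them. The mapping below is a rough sketch based only on the descriptions in this article; the enum, the check names, and the `checks_for` helper are hypothetical labels and do not correspond to any Siteimprove API.

```python
from enum import Enum


class LinkKind(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"
    EXCLUDED = "excluded"


# Hypothetical check names summarising the article's descriptions.
CHECKS = {
    # Internal content is fully evaluated.
    LinkKind.INTERNAL: ["http_response", "broken_links", "misspellings", "accessibility"],
    # External links are only verified to exist.
    LinkKind.EXTERNAL: ["http_response"],
    # Excluded URLs are ignored entirely.
    LinkKind.EXCLUDED: [],
}


def checks_for(kind: LinkKind) -> list[str]:
    """Return a copy of the checks that apply to a URL of the given kind."""
    return list(CHECKS[kind])


print(checks_for(LinkKind.EXTERNAL))
```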
How to add an Alias or Exclusion on your site(s)
- Select Settings from the main menu
- Select Content from the side-bar menu
- Select Crawl Settings from the sub-menu
- Click on the site that you would like to add Exclusions and/or Aliases on
- Click Exclude or click Alias, depending on what you want to set up
- If you are setting up an Exclusion, type in the URL exclusion match and click Create exclusion
- If you are setting up an Alias, only a domain name is required. Typing in example.com automatically ensures that all subdomains are included, e.g. www.example.com, news.example.com, and any other subdomains that you may have. Conversely, if you identify a subdomain by typing in the alias news.example.com, only this subdomain will be included. In both cases, aliases are only crawled if a link exists between your main site and the domains/subdomains identified on the page. Indicate whether links/pages that match the alias should be considered internal or external. If you do not select the Crawl as external content box, the content will be treated as internal.
- Click Create alias to finish
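The subdomain behavior described in the alias step can be illustrated with a small host comparison. This is only a sketch of the described outcome, not the crawler's real logic: the function `host_matches_alias` is hypothetical, and matching on host-name boundaries is an assumption (the partial-match note earlier in the article suggests the actual matching may be a looser substring test).

```python
from urllib.parse import urlsplit


def host_matches_alias(url: str, alias: str) -> bool:
    """Return True when the URL's host is the alias domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    alias = alias.lower()
    # Assumption: compare on host-name boundaries, so "example.com"
    # covers "news.example.com" but not "notexample.com".
    return host == alias or host.endswith("." + alias)


# An alias of example.com covers every subdomain.
print(host_matches_alias("https://news.example.com/story", "example.com"))

# An alias of news.example.com covers only that subdomain.
print(host_matches_alias("https://www.example.com/", "news.example.com"))
```

The first call evaluates to True and the second to False, mirroring the two cases in the step above.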