Understanding Exclusions and Aliases

By Guðrún Unnur Gústafsdóttir

This article explains Siteimprove's exclusion and alias options in detail. If you already understand exclusions and aliases and only need to know where to enter them, see the following article: Adding and removing content from the crawl (exclusions/aliases)

Exclusions and aliases configure how the Siteimprove crawler categorizes each link: excluded, internal page, or external page. This categorization determines how thoroughly we evaluate the link and its contents.

Note: When setting up exclusions and aliases, only a partial match on the link is needed. A match of "/calendar/" applies to all links containing "/calendar/".
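
The partial-match behavior in the note can be illustrated with a short Python sketch. This is not Siteimprove's implementation: the `matches` function and the sample links are hypothetical, and the sketch assumes a plain, case-sensitive substring test.

```python
def matches(pattern: str, link: str) -> bool:
    """Return True when the pattern appears anywhere in the link.

    Assumption: a plain, case-sensitive substring test stands in for
    the crawler's partial matching.
    """
    return pattern in link


# Hypothetical sample links, for illustration only.
links = [
    "http://www.example.com/calendar/2024/",
    "http://www.example.com/news/calendar/events.html",
    "http://www.example.com/news/index.html",
]

# The single pattern "/calendar/" catches every link that contains it,
# regardless of where in the link it appears.
matched = [link for link in links if matches("/calendar/", link)]
```

Because the match is partial, a short pattern can catch more links than intended, so it is worth choosing a pattern that is specific enough for the content you mean to target.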

Exclusions

Excluded links are completely ignored by the crawler. It's as if the link never existed.

Excluded links will NOT:

  • Be checked for HTTP response code
  • Be checked for broken links, misspellings or accessibility issues
  • Show in the site inventory

Reasons to exclude a link:

  • The link is being flagged as a false positive broken link and ignoring the link manually would be too cumbersome
  • The resources used to check the link(s) outweigh the benefits

Aliases

Aliasing is how we determine whether a link should be considered internal or external.

Internal Pages

The link to the page and the content within the page will be checked for broken links, misspellings, and accessibility issues.

External Pages

Only the link to an external page will be checked. We will not evaluate the content on these pages. 

Default Alias Setup

By default, only the content within the folder that the crawler starts on will be considered internal. In the example URL below, the crawl starts on the index.html file in the /news/ folder, so only content within the /news/ folder would be considered internal.

Example URL: http://www.example.com/news/index.html

Any page that contains http://www.example.com/news/ would be considered an internal page. Examples:

http://www.example.com/news/index.html 
http://www.example.com/news/breaking/story123/ 

Any page that does not contain http://www.example.com/news/ would be considered an external page. Examples:

http://www.example.com/sports/index.html 
http://www.example.com/weather/index.html
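
The default rule above can be sketched in Python. This is an illustration under stated assumptions, not Siteimprove's implementation: the function names `start_folder` and `classify_default` are hypothetical, and the check is a plain substring test that follows the "contains" wording used in this section.

```python
from urllib.parse import urlsplit


def start_folder(index_url: str) -> str:
    """Return the folder part of the start URL, with a trailing slash.

    Assumption: the folder is everything up to the last "/" in the path.
    """
    parts = urlsplit(index_url)
    folder = parts.path.rsplit("/", 1)[0] + "/"
    return f"{parts.scheme}://{parts.netloc}{folder}"


def classify_default(link: str, index_url: str) -> str:
    """Label a link "internal" if it contains the start folder, else "external"."""
    return "internal" if start_folder(index_url) in link else "external"


index_url = "http://www.example.com/news/index.html"

# The start folder is http://www.example.com/news/
folder = start_folder(index_url)

inside = classify_default("http://www.example.com/news/breaking/story123/", index_url)
outside = classify_default("http://www.example.com/sports/index.html", index_url)
```

Here `inside` evaluates to "internal" and `outside` to "external", matching the example URLs listed above.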

Internal Alias

Internal links will be checked for all Quality Assurance (QA) and Accessibility issues.

Reason to set up an internal alias:

  • You need a page to be checked for QA and Accessibility issues, but it does not match the folder of the IndexURL

External Alias

External links will only be checked for the HTTP response code. 

Reason to set up an external alias:

  • A URL matches the folder of the IndexURL, but we do not need to check it for QA and Accessibility issues. We only need to ping it for the HTTP status code.
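
To show how exclusions, aliases, and the default folder rule could fit together, here is a minimal Python sketch. It is not Siteimprove's implementation. The `CrawlRules` class, the `classify` function, and the sample patterns are hypothetical, and the order in which the rules are applied (exclusions first, then external aliases, then internal aliases, then the default folder rule) is an assumption made for this example; this article does not state how overlapping rules are resolved.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class CrawlRules:
    """Hypothetical container for the settings described in this article."""
    index_url: str
    exclusions: list[str] = field(default_factory=list)
    internal_aliases: list[str] = field(default_factory=list)
    external_aliases: list[str] = field(default_factory=list)


def start_folder(index_url: str) -> str:
    """Return the folder of the start URL, with a trailing slash."""
    parts = urlsplit(index_url)
    return f"{parts.scheme}://{parts.netloc}{parts.path.rsplit('/', 1)[0]}/"


def classify(link: str, rules: CrawlRules) -> str:
    """Return "excluded", "internal", or "external" for a link.

    Assumption: rules are checked in this order, using partial matches.
    """
    if any(pattern in link for pattern in rules.exclusions):
        return "excluded"   # ignored entirely by the crawler
    if any(pattern in link for pattern in rules.external_aliases):
        return "external"   # only the HTTP response code is checked
    if any(pattern in link for pattern in rules.internal_aliases):
        return "internal"   # full QA and Accessibility checks
    # Default: internal only when the link contains the start folder.
    return "internal" if start_folder(rules.index_url) in link else "external"


rules = CrawlRules(
    index_url="http://www.example.com/news/index.html",
    exclusions=["/calendar/"],
    internal_aliases=["http://www.example.com/sports/"],
    external_aliases=["http://www.example.com/news/archive/"],
)

results = {
    link: classify(link, rules)
    for link in [
        "http://www.example.com/news/calendar/2024/",
        "http://www.example.com/sports/index.html",
        "http://www.example.com/news/archive/2019/",
        "http://www.example.com/weather/index.html",
    ]
}
```

In this sketch, the calendar link is excluded, the sports page becomes internal through its alias, the news archive is demoted to external even though it sits inside the start folder, and the weather page stays external by default.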