This article offers tips and tricks for setting up a new site on your Siteimprove account.
Account Owners can add new sites to Siteimprove directly in the Siteimprove Intelligence Platform.
Setting up a site is fairly straightforward, but the details can be tricky. Be sure to follow these instructions when adding a new site to your Siteimprove subscription.
1. Check that your Index-URL (the URL where the crawler starts) actually works.
If it redirects or leads to an error page, the crawl will not work as expected.
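A quick way to run this check before adding the site is sketched below. It is only an illustration, not part of Siteimprove: the function names `fetch_status` and `index_url_problem` are hypothetical, and it assumes that any 3xx response counts as a redirect and any 4xx or 5xx response counts as an error page.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of `url` WITHOUT following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def index_url_problem(status: int) -> Optional[str]:
    """Describe why a status code makes a poor Index-URL, or None if it looks fine.

    Assumption for this sketch: 3xx means a redirect, 4xx/5xx means an error page.
    """
    if 300 <= status < 400:
        return "redirects"
    if status >= 400:
        return "error page"
    return None
```

For example, `index_url_problem(fetch_status("https://www.example.com/"))` returns `None` when the URL answers directly with a success status, which suggests it is a reasonable starting point for the crawl.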
2. Make sure you set up the site to pick up all the pages you want.
If you set up a crawl on https://www.example.com, then it will NOT crawl https://subsite.example.com and it will also NOT crawl https://example.com. To have these domains crawled, either set up a second site with a different Index-URL or add "subsite.example.com" or "example.com" as an internal alias so that their pages are treated as part of the original site.
If you set up the site starting at a subdirectory, then Siteimprove will only crawl that subdirectory.
Example 1: A crawl starting at www.example.com/at/de/business.html will crawl everything that is in www.example.com/at/de/ and its subdirectories. Note that the position of the last "/" in the Index-URL is what decides which pages are crawled and which are not.
Example 2: A crawl starting at www.example.com/at/de will crawl everything that is in www.example.com/at/ and its subdirectories.
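The host and subdirectory rules in this step can be modelled in a few lines. This is a minimal sketch of the behavior described above, not Siteimprove's actual crawler logic: the helper names (`same_host`, `crawl_scope`, `in_scope`) are hypothetical, and it assumes an alias is matched by exact hostname.

```python
from typing import Iterable
from urllib.parse import urlsplit


def _split(url: str):
    """Parse a URL, tolerating scheme-less input such as 'www.example.com/at/de'."""
    if "://" not in url:
        url = "https://" + url
    return urlsplit(url)


def same_host(index_url: str, candidate_url: str,
              aliases: Iterable[str] = ()) -> bool:
    """True if the candidate is on the Index-URL's host or on a listed alias host."""
    candidate_host = _split(candidate_url).hostname
    allowed = {_split(index_url).hostname, *(a.lower() for a in aliases)}
    return candidate_host in allowed


def crawl_scope(index_url: str) -> str:
    """Return host plus path up to and including the last '/' of the Index-URL."""
    parts = _split(index_url)
    path = parts.path or "/"
    return parts.netloc + path[: path.rfind("/") + 1]


def in_scope(index_url: str, candidate_url: str) -> bool:
    """True if the candidate falls under the directory the crawl starts in."""
    parts = _split(candidate_url)
    return (parts.netloc + (parts.path or "/")).startswith(crawl_scope(index_url))
```

With these helpers, `crawl_scope("www.example.com/at/de/business.html")` gives `www.example.com/at/de/`, while `crawl_scope("www.example.com/at/de")` gives `www.example.com/at/`, which matches Example 1 and Example 2.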
3. Make sure that you do not crawl and index pages you do not want.
Once you have set up your new site and it has been crawled, you might notice that Siteimprove indexes pages that you do not need to have checked. You can check which pages are in your index by either:
a) Searching for them with the search box (magnifying glass) in the upper right corner of the Siteimprove Intelligence Platform
b) Having a look in Quality Assurance -> Inventory -> Pages
c) Getting an overview in Quality Assurance -> Inventory -> Sitemap
If you find pages or subdirectories that you do not want to check, you can leave them out by using either exclusions or aliases. Use exclusions if you no longer want the links checked (for errors); use aliases if you no longer want the pages indexed. Learn more about how exclusions and aliases work.