
Readability: How has Siteimprove identified content on my page?

By Guðrún Unnur Gústafsdóttir

When checking for content on a web page, Siteimprove excludes page templates and boilerplate language. Template or boilerplate elements usually appear on every page and may include menus, banners, headers, footers and columns. Siteimprove only checks content outside the template or boilerplate to calculate a page's readability score.

Automatically extracting content from websites is complex and depends in many ways on how each website has been built. Some elements on web pages are not treated as content because they make calculating a readability score nearly impossible. Tables and bullet lists of short texts are excluded because they are seen as paragraphs containing very few words and characters.
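To illustrate why short fragments are left out, here is a minimal Python sketch of filtering text blocks by length before scoring. It is not Siteimprove's actual implementation: the function name `select_scorable_blocks` and the `MIN_WORDS` threshold are hypothetical and chosen only for illustration.

```python
# Hypothetical threshold; not a documented Siteimprove value.
MIN_WORDS = 5


def select_scorable_blocks(blocks, min_words=MIN_WORDS):
    """Keep only text blocks long enough to be scored as running prose."""
    return [block for block in blocks if len(block.split()) >= min_words]


# Example page: menu labels and a list heading mixed with one real paragraph.
blocks = [
    "Home",
    "Contact us",
    "Our library is open every weekday and offers free access to computers.",
    "Opening hours",
]

scorable = select_scorable_blocks(blocks)
print(scorable)
```

In this sketch only the full sentence survives the filter; the menu labels and the short heading are dropped, much as short table cells and bullet items would be.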

Because automatic extraction is imperfect, Siteimprove's readability algorithm makes a best-effort attempt to identify content on your page. Generally speaking, pages with very little content return high readability scores if they contain long or polysyllabic words. Pages should therefore always be reviewed manually, regardless of the readability score returned, using the highlights as markers.
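The effect of very light content can be sketched with a short Python example. It assumes a grade-level style score in which a higher number means harder text, and uses the Flesch-Kincaid grade level formula as a stand-in; the article does not state which formula applies to your site. The syllable counter is a rough vowel-group heuristic and all function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels (at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fk_grade(text):
    """Flesch-Kincaid grade level, used here only as an illustrative stand-in."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# A page with almost no content, dominated by two long words.
light_page = "Interoperability considerations."

# The same two words alongside a few short, simple sentences.
fuller_page = ("We sell bikes. Our shop is open all week. "
               "Interoperability considerations.")

print(round(fk_grade(light_page), 1), round(fk_grade(fuller_page), 1))
```

With so few words on the light page, the two long words dominate the syllables-per-word average and push the score far higher than on the fuller page, which is why a manual review is still worthwhile.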


