
Data Privacy: Technical Specifications

Sean Needham


The Siteimprove Data Privacy module gives you visibility into and control over your websites' data. It helps you demonstrate compliance with regulations (e.g. the GDPR's accountability principle) by providing an overview of domains, sub-domains, and IP addresses that may belong to your organization, and by enabling searches for text-based personal data in website files and pages.

The Data Privacy module also enables you to see an overview of cookies appearing on your website.

For further information on using the Data Privacy module, see "Siteimprove Data Privacy".

This table gives an overview of the specifications and limitations of Data Privacy:

Data Privacy Functionalities Specifications & Limitations

Crawl operation

Data Privacy includes crawling of your website. The Siteimprove crawler crawls each unique URL (page) and detects personal data on web pages and in files.

Domains

Data Privacy crawl operations cover the domains the Customer has chosen to include in the use of the Data Privacy Solution.

Check in files  

Universal search identifies and highlights text-based personal data in the following file formats:

  • Word
  • PPT
  • XLS
  • PDF
  • JPEG
  • IMG
  • PNG
  • XML
  • ODS
  • OTT
  • SXW
  • ODM
  • ODT
  • ODP
  • OTS
  • OTP
  • TSV
  • TIF (content in metadata tags only)
  • Metadata
  • Embedded files

The following are not checked in Universal search:

  • Image-based PDFs
  • Files containing a virus found on HTTP sites
  • ODP files when password protected
  • Files that are password protected

Only files of up to 20 MB can be checked.
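The format and size limits above can be read as a simple eligibility test. The following is a minimal sketch, not Siteimprove's implementation: the function name `is_checkable`, the mapping from format names to file extensions, and the reading of 20 MB as 20 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: file extensions inferred from the format names listed above.
# "IMG", "Metadata" and "Embedded files" have no obvious extension and are omitted.
UNIVERSAL_SEARCH_EXTENSIONS = {
    ".doc", ".docx", ".ppt", ".pptx", ".xls", ".xlsx", ".pdf",
    ".jpeg", ".jpg", ".png", ".xml", ".ods", ".ott", ".sxw",
    ".odm", ".odt", ".odp", ".ots", ".otp", ".tsv", ".tif",
}

# Assumption: "20MB" is interpreted as 20 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024


def is_checkable(url_path: str, size_bytes: int, password_protected: bool = False) -> bool:
    """Return True if a file falls inside the limits described above."""
    extension = PurePosixPath(url_path).suffix.lower()
    return (
        extension in UNIVERSAL_SEARCH_EXTENSIONS
        and size_bytes <= MAX_FILE_SIZE_BYTES
        and not password_protected
    )
```

For example, `is_checkable("/docs/report.PDF", 1024)` is true, while a 21 MB PDF or a password-protected ODP file is rejected. Image-based PDFs and virus-infected files would need content inspection, which this sketch does not attempt.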

For Personal Identification Numbers and email addresses, personal ID numbers are identified and highlighted in the following file formats:

  • Word
  • PDF
  • XLS
  • PPT
  • XML
  • ODS
  • OTT
  • SXW
  • ODM
  • ODT
  • ODP
  • OTS
  • OTP
  • TXT
  • CSV
  • TSV
  • Embedded files

The following are not checked in Personal Identification Numbers and email address checks:

  • Image-based PDFs
  • Files containing a virus found on HTTP sites
  • ODT metadata in custom properties
  • ODT metadata in the document description
  • ODP files when password protected
  • Files that are password protected

Only files of up to 20 MB can be checked.

Personal Identification Numbers Check

All verifications follow standardized country-specific format checks. By default, checks are set to one country-specific format per site.
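As an illustration of what a country-specific format check with one format per site could look like, here is a minimal sketch. It is not Siteimprove's detection logic: the pattern table, the country codes, and the function `find_id_numbers` are hypothetical, and the two patterns are deliberately simplified shape checks (a Danish CPR-style number as DDMMYY followed by four digits, and a US SSN-style number as 3-2-4 digits) without any checksum or range validation.

```python
import re

# Hypothetical, simplified format patterns; real validation rules are stricter.
ID_PATTERNS = {
    # DDMMYY followed by an optional hyphen and four digits
    "DK": re.compile(r"\b(?:0[1-9]|[12]\d|3[01])(?:0[1-9]|1[0-2])\d{2}-?\d{4}\b"),
    # Three digits, two digits, four digits, separated by hyphens
    "US": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_id_numbers(text: str, site_country: str) -> list[str]:
    """Return substrings of text matching the single format configured for the site."""
    pattern = ID_PATTERNS[site_country]
    return [match.group(0) for match in pattern.finditer(text)]
```

With the sample text `"Contact: 010190-1234 or 123-45-6789"`, a site configured for `"DK"` yields only `010190-1234`, and a site configured for `"US"` yields only `123-45-6789`, which mirrors the one-format-per-site default described above.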

Cookies

The following cookies will not be detected:

  • Interactive cookies triggered by user action
  • Cookies set more than 3 seconds after the crawler lands on the page and crawls it

Moreover, cookies that depend on user-specific conditions or browser information, country-specific cookies, and cookies that for various reasons are not set each time a user enters the site will not be detected. Examples of cookies that are not detected:

  • Geographic location, e.g. some cookies may be presented only to users in the US due to legal requirements.
  • Specific conditions, e.g. some cookies may be set on a specific day, after a user's third visit, or when a user has been on the site for more than two minutes.
  • Login conditions, e.g. users logged into a members' area or LinkedIn get specific cookies.
  • Specific screen resolution, e.g. iPhone users receive specific cookies.
  • Information about the browser used to access a site, e.g. Internet Explorer may not display a Facebook Like button, or set the cookie that supports it, because Facebook has not been visited previously.
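The two timing-related rules above (no user-triggered cookies, nothing set more than 3 seconds after landing) can be sketched as a filter over observed cookie events. This is an illustrative model only: the `CookieEvent` type, its fields, and the function `detectable_cookies` are assumptions, not part of the Siteimprove crawler.

```python
from dataclasses import dataclass

# From the limits above: cookies set more than 3 seconds after landing are missed.
DETECTION_WINDOW_SECONDS = 3.0


@dataclass
class CookieEvent:
    """Hypothetical record of a cookie being set during a page visit."""
    name: str
    seconds_after_landing: float
    triggered_by_user_action: bool = False


def detectable_cookies(events: list[CookieEvent]) -> list[str]:
    """Return names of cookies set without user action inside the detection window."""
    return [
        event.name
        for event in events
        if not event.triggered_by_user_action
        and event.seconds_after_landing <= DETECTION_WINDOW_SECONDS
    ]
```

Given a session cookie set at 0.2 seconds, a chat-widget cookie set at 5 seconds, and a preference cookie set at 1 second by a click, only the session cookie would be reported. Conditional cookies such as those tied to geography or login state never appear as events for the crawler at all, so they fall outside this filter entirely.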

Email Alerts & Notifications

Siteimprove alerts and sends notifications to the Customer when the crawl operation has detected a personal identification number on the website.

Customers can edit or select the country-specific format and add relevant email addresses to receive alerts and notifications.
