Page date from CSS Selector

By Sean Needham
Updating out-of-date content and ensuring that content is reviewed regularly are important parts of website maintenance.
As a web editor, you might want to focus on newly created pages, and an administrator might want to monitor pages that have been modified during the last week.
The "Page date from CSS Selector" Policy rule can parse certain date formats in the HTML code, enabling you to set up policies relating to the scenarios above.
Identifying dates on pages
The HTML element that contains the date is identified with a CSS selector. The following example shows two meta tags in the head section of a page; the first one contains the date the page was last modified:
<head>
<title>My page</title>
<meta name="Last-Modified" content="2015-09-24T19:12:22Z" />
<meta name="Description" content="My important content is on this page" />
</head>
The CSS selector to match the "Last-Modified" meta tag would be meta[name='Last-Modified']. Policy will read the value of the "content" attribute, and use that as the date.
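To illustrate what this selector picks out, here is a minimal Python sketch that finds the same meta tag and reads its "content" attribute using only the standard library. It is not how Policy works internally; the class name `MetaContentFinder` and the function `find_meta_content` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class MetaContentFinder(HTMLParser):
    """Collects the content attribute of <meta> tags with a given name.

    Rough equivalent of the selector meta[name='Last-Modified'] (hypothetical helper).
    """

    def __init__(self, meta_name):
        super().__init__()
        self.meta_name = meta_name
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        content = attributes.get("content")
        if attributes.get("name") == self.meta_name and content is not None:
            self.values.append(content)


def find_meta_content(html, meta_name):
    """Return the content of the first matching meta tag, or None."""
    finder = MetaContentFinder(meta_name)
    finder.feed(html)
    return finder.values[0] if finder.values else None


SAMPLE_HEAD = """<head>
<title>My page</title>
<meta name="Last-Modified" content="2015-09-24T19:12:22Z" />
<meta name="Description" content="My important content is on this page" />
</head>"""

last_modified = find_meta_content(SAMPLE_HEAD, "Last-Modified")
```

Running a check like this against your own page templates is a quick way to confirm that the tag and attribute you plan to target are actually present.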
Supported date formats
Three groups of date formats are supported: variants of ISO 8601; RFCs 1123, 2822 and 5322; and Unix Epoch.
Group | Formats | Examples | Notes |
---|---|---|---|
ISO 8601 | yyyy-MM-dd; yyyy-MM-ddTHH:mm:ssK | 2015-09-24; 2015-09-24T19:12:22Z; 2015-09-24T19:12:22.0000000Z | Supports three granularities: date, date with time, and date with time and fractions of a second. The timezone identifier is Z for UTC, or an offset written as +01:00 or +0100. A value without a timezone identifier is interpreted as UTC. |
RFCs | ddd, dd MMM yyyy HH:mm:ss GMT; ddd, dd MMM yyyy HH:mm:ss K | Mon, 15 Jun 2009 20:45:30 GMT; Mon, 15 Jun 2009 20:45:30 +0100 | Variants are commonly used for HTTP date headers, such as the Last-Modified header. Day and month names must be in English, as per the RFCs. A value without a timezone identifier is interpreted as UTC. |
Unix Epoch | A large whole number | 1443052852 | The number of seconds since 1 January 1970 (also known as the Unix Epoch). The date is interpreted as UTC. |
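The three groups can be told apart mechanically. The following Python sketch shows one way to do that; it is an illustration under stated assumptions, not Policy's actual parser. The names `ISO_PATTERN`, `parse_iso` and `parse_page_date` are hypothetical, and the sketch assumes that any value made up only of digits is a Unix Epoch timestamp.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical pattern covering the ISO 8601 variants described above.
ISO_PATTERN = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"                  # date
    r"(?:T(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?"  # optional time and fractions
    r"(Z|[+-]\d{2}:?\d{2})?)?"                  # optional Z, +01:00 or +0100
)


def parse_iso(text):
    """Parse one of the ISO 8601 variants into a timezone-aware datetime."""
    match = ISO_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a supported ISO 8601 variant: {text!r}")
    year, month, day, hour, minute, second, fraction, zone = match.groups()
    # datetime stores microseconds only, so extra fraction digits are dropped.
    microsecond = int(fraction[:6].ljust(6, "0")) if fraction else 0
    tz = timezone.utc  # Z, or no identifier at all, means UTC
    if zone and zone != "Z":
        digits = zone[1:].replace(":", "")
        offset = timedelta(hours=int(digits[:2]), minutes=int(digits[2:]))
        tz = timezone(offset if zone[0] == "+" else -offset)
    return datetime(int(year), int(month), int(day), int(hour or 0),
                    int(minute or 0), int(second or 0), microsecond, tzinfo=tz)


def parse_page_date(value):
    """Return the date as a UTC datetime, trying epoch, ISO 8601, then RFC style."""
    text = value.strip()
    if re.fullmatch(r"[0-9]+", text):
        # Assumption: an all-digit value is seconds since the Unix Epoch.
        return datetime.fromtimestamp(int(text), tz=timezone.utc)
    if re.match(r"\d{4}-", text):
        parsed = parse_iso(text)
    else:
        # Handles RFC-style dates such as "Mon, 15 Jun 2009 20:45:30 GMT".
        parsed = parsedate_to_datetime(text)
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

For example, `parse_page_date("Mon, 15 Jun 2009 20:45:30 +0100")` and `parse_page_date("2009-06-15T19:45:30Z")` describe the same moment, because every value is normalized to UTC before it is returned.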
The following table shows the built-in date formats that will be accepted by Policy in a range of programming languages:
Language/environment | Format strings | Documentation |
---|---|---|
.NET | O, o, R, r | Standard Date and Time Format Strings |
PHP | c, r | Date/Time Functions |
Java | N/A | Java has no single-letter standard format string for ISO 8601. The Joda-Time library supports the standard, as does the java.time package in Java 8 and later. Learn more about Custom Formatters. |
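Python is not listed in the table, but its standard library can produce all three groups of formats. The sketch below is one possible approach, not an official recommendation; the function name `render_last_modified` is hypothetical. It uses `email.utils.format_datetime` for the RFC variant because that function always writes English day and month names, whereas `strftime` with `%a` or `%b` depends on the system locale.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def render_last_modified(moment):
    """Render a timezone-aware datetime in the three supported groups.

    Hypothetical helper; pass an aware datetime, since a naive one
    would be treated as local time by astimezone().
    """
    utc = moment.astimezone(timezone.utc)
    return {
        # ISO 8601 date with time, using Z for UTC
        "iso8601": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # RFC-style date with English names and a GMT suffix
        "rfc": format_datetime(utc, usegmt=True),
        # Whole seconds since 1 January 1970 (Unix Epoch)
        "epoch": str(int(utc.timestamp())),
    }


rendered = render_last_modified(
    datetime(2015, 9, 24, 19, 12, 22, tzinfo=timezone.utc)
)
```

Any one of the three values could then be written into the "content" attribute of the meta tag that your CSS selector targets.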
What is ISO 8601?
ISO 8601 is an international standard for writing dates and times. Due to different date formats in different countries, it is very easy to misinterpret dates, e.g. confusing the numbers for days and months.
In 1988, the International Organization for Standardization (ISO) published a global standard numeric date format, which is recognized internationally as the agreed way to represent dates:
- YYYY-MM-DD
- 2015-03-11: 11th March 2015
- 2015-11-03: 3rd November 2015
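The ambiguity that ISO 8601 removes can be shown in a few lines of Python. This is only an illustration; the slash-separated value and the variable names are made up for the example.

```python
from datetime import date, datetime

# A slash-separated date can be read two ways, depending on local convention.
ambiguous = "03/11/2015"
month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()  # 11 March 2015
day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()    # 3 November 2015

# The ISO 8601 form has a fixed order (year, month, day), so only one reading exists.
unambiguous = date.fromisoformat("2015-03-11")
```

Here `month_first` and `day_first` end up almost eight months apart even though they come from the same text, while the ISO 8601 value can only mean 11 March 2015.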
Want to learn more about CSS selectors?
Read our article CSS Selectors & Siteimprove Policy and try out the CSS Selector rule in your policies.