Creating Semantically-Rich OpenPublish Themes

What Is This Semantic Business About and Why Should You Care?

The World Wide Web started as a collection of hyperlinked documents. In the early days of the Internet you would go to a website that you knew had the information you wanted, drill down into the content by following hyperlinks and, hopefully, find what you were looking for. Over time, however, the amount of content on the Internet, the number of publishers and the diversity of information consumers snowballed, making it impractical, if not impossible, to find content by browsing alone. At some point there were too many content sources and not enough time for humans to browse through them. That is when search engines began their rise.

The emergence of search engines as an important "audience" for websites exposed a slew of problems with the way websites had been built. When pages were built only for humans, their HTML markup was used solely to achieve the desired look and feel. Alas, the software behind search engines does not have the same power of visual perception that humans do. Looking at a well-designed user interface, most humans can very quickly identify the structural elements of a page, such as an article's title, its author and its content. It is much harder and more error-prone for computers to do the same. On the other hand, computers can analyze and digest far more information than humans can, and they generally don't need to rest or sleep, nor are they in the habit of procrastinating at work. That's why we want machines to "understand" content on the Web better and make our web experience richer (which, of course, should leave more time for our own procrastination).

To make it easier and less error-prone for computers to understand content on the web, pages must expose the structure of that content. Exposing the structure, or semantics, of content on the web is the main goal of the Semantic Web movement (a.k.a. the Next Big Thing, or Web 3.0).

Here is how Wikipedia defines the Semantic Web:

"Semantic Web is a term coined by World Wide Web Consortium (W3C) director Sir Tim Berners-Lee. It describes methods and technologies to allow machines to understand the meaning - or "semantics" - of information on the World Wide Web.

According to the original vision, the availability of machine-readable metadata would enable automated agents and other software to access the Web more intelligently. The agents would be able to perform tasks automatically and locate related information on behalf of the user."

If you still doubt whether semantic markup matters to you, consider this: a theme that exposes the semantics of your content helps your search engine optimization (SEO) efforts, and that may be the least of the benefits you get for a relatively small investment.

Semantic Theming in OpenPublish

Now that we have, hopefully, got you all jazzed up about semantic theming, let's see what OpenPublish does to help you with it.

The predominant way OpenPublish exposes the semantics of content is through RDFa. "RDFa (or Resource Description Framework in attributes) is a W3C Recommendation that adds a set of attribute-level extensions to XHTML for embedding rich metadata within Web documents." [Wikipedia]

At the time of this writing, OpenPublish uses two vocabularies: Dublin Core and Common Tag. You can add other RDFa vocabularies in your themes, or add support for Microformats, another common way of exposing the semantics of content. Microformats can be used alongside RDFa; it does not have to be one or the other.

The current implementation of RDFa in OpenPublish lives almost entirely in the theme layer. The default OpenPublish Base Theme contains most of the RDFa markup. This gives great flexibility when sub-theming and lets the base theme serve as a good reference implementation, but it does not provide full automation. When sub-theming, you have to make sure that the RDFa markup present in the base theme survives in every template you override, and that it is also present in any new TPL files you develop for your theme. Later in this document we describe how the OpenPublish Base Theme incorporates RDFa, which should serve as a guide for implementing the same in your own theme(s).

Drupal 7 has RDF and RDFa support in core. When OpenPublish is upgraded to Drupal 7 (estimated: second quarter 2011) we will be able to automate many of the manual tasks in exposing RDFa.

Examples of RDFa in OpenPublish

The very first thing you need to do to start using RDFa is to declare the proper document type and reference the vocabularies you use. Basically, your theme's <html> tag should look something like the following:
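As a minimal sketch, and not necessarily the exact markup OpenPublish ships, the snippet below builds an XHTML+RDFa 1.0 doctype and an opening <html> tag that binds two vocabulary prefixes. The helper name example_rdfa_html_open(), the prefix names dc and ctag, and the choice of the Dublin Core Terms namespace are assumptions made for illustration.

```php
<?php
// Hypothetical helper for illustration; it is not part of OpenPublish or Drupal.
// It builds the doctype and opening <html> tag that a page template would print.
function example_rdfa_html_open($lang = 'en') {
  // Assumed prefix-to-vocabulary bindings (Dublin Core Terms and Common Tag).
  $namespaces = array(
    'dc'   => 'http://purl.org/dc/terms/',
    'ctag' => 'http://commontag.org/ns#',
  );

  $attributes = array('xmlns="http://www.w3.org/1999/xhtml"');
  foreach ($namespaces as $prefix => $uri) {
    $attributes[] = 'xmlns:' . $prefix . '="' . $uri . '"';
  }
  $attributes[] = 'version="XHTML+RDFa 1.0"';
  $attributes[] = 'xml:lang="' . htmlspecialchars($lang, ENT_QUOTES, 'UTF-8') . '"';

  $doctype = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" '
    . '"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">';

  return $doctype . "\n" . '<html ' . implode(' ', $attributes) . '>';
}

print example_rdfa_html_open('en') . "\n";
```

Every prefix used later in a property attribute (for example dc:title) has to be bound here, otherwise an RDFa parser cannot resolve it to a vocabulary.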

The next most important thing is to use proper markup for content titles, authors and descriptions. It is also very important (both for semantic markup and for SEO) to use proper HTML tags for those. For the title of the current page (e.g. the detail view of an article), the markup should look like:
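One way to picture this is an H1 element that carries a Dublin Core title property. The sketch below is illustrative only: example_rdfa_page_title() is a hypothetical helper, and dc:title assumes a dc prefix bound on the <html> tag. Inside Drupal you would escape the title with check_plain(); htmlspecialchars() is used here so the sketch runs on its own.

```php
<?php
// Hypothetical helper for illustration; OpenPublish's own preprocessors may differ.
function example_rdfa_page_title($title) {
  // Escape the title so special characters cannot break the markup.
  $safe = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
  // dc:title is an assumed property name, with "dc" bound on the <html> tag.
  return '<h1 property="dc:title">' . $safe . '</h1>';
}

// Prints: <h1 property="dc:title">Semantic Theming &amp; You</h1>
print example_rdfa_page_title('Semantic Theming & You') . "\n";
```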

Sample markup of content author(s), published date and content body:

Note: the date is in RFC 3339 format, which is a profile of ISO 8601, and can be obtained simply by passing "c" as the format to PHP's date() function.
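A rough sketch of author, date and body markup follows. The property names (dc:creator, dc:created, dc:description) and the helper example_rdfa_article_meta() are assumptions for illustration, not confirmed OpenPublish output. The part taken directly from the note is the use of date() with the "c" format for the machine-readable timestamp.

```php
<?php
// Hypothetical helper for illustration; property names are assumptions.
function example_rdfa_article_meta($author, $created, $body_html) {
  // "c" yields a timestamp such as 2010-01-01T00:00:00+00:00.
  $iso_date = date('c', $created);
  $readable = date('F j, Y', $created);
  $safe_author = htmlspecialchars($author, ENT_QUOTES, 'UTF-8');

  // The content attribute carries the machine-readable date while the
  // element text stays human-friendly. $body_html is assumed to be
  // already-filtered markup and is inserted unchanged.
  return '<div class="byline">'
    . '<span property="dc:creator">' . $safe_author . '</span> '
    . '<span property="dc:created" content="' . $iso_date . '">' . $readable . '</span>'
    . '</div>' . "\n"
    . '<div property="dc:description">' . $body_html . '</div>';
}

// Fix the time zone so the output does not depend on server settings.
date_default_timezone_set('UTC');
print example_rdfa_article_meta('Jane Doe', 1262304000, '<p>Body text.</p>') . "\n";
```

Putting the timestamp in the content attribute lets the visible date follow whatever format suits your design without changing what machines read.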

If you have a listing page of content items (e.g. latest headlines, latest blog posts or an articles listing), use an H2 tag for each title and point the "about" attribute at the item's detail page. You may also want to show teasers (abstracts):
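The listing case can be sketched as one wrapper per item whose about attribute names the detail page, with an H2 title and a teaser inside it. The helper example_rdfa_teaser(), the sample URLs and the property names are hypothetical; the markup OpenPublish actually emits may differ.

```php
<?php
// Hypothetical helper for illustration; the markup OpenPublish emits may differ.
function example_rdfa_teaser($url, $title, $teaser) {
  $safe_url = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
  $safe_title = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
  $safe_teaser = htmlspecialchars($teaser, ENT_QUOTES, 'UTF-8');

  // "about" names the resource (the detail page) that the nested
  // properties describe, so each list item is tied to its own URL.
  return '<div about="' . $safe_url . '">'
    . '<h2><a href="' . $safe_url . '">'
    . '<span property="dc:title">' . $safe_title . '</span></a></h2>'
    . '<p property="dc:description">' . $safe_teaser . '</p>'
    . '</div>';
}

// Sample items: detail page URL, title, teaser.
$items = array(
  array('/articles/first-post', 'First Post', 'A short abstract.'),
  array('/articles/second-post', 'Second Post', 'Another abstract.'),
);
foreach ($items as $item) {
  print example_rdfa_teaser($item[0], $item[1], $item[2]) . "\n";
}
```

Without the about attribute, every title on the listing page would be read as a property of the listing page itself rather than of the individual article.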

API Functions and Implementation in OpenPublish

As of the latest version of OpenPublish, these functions are used in certain node and views preprocessors to get RDFa-enhanced titles, authors and publication dates.

The preprocessing for node display happens in openpublish_core/theme_helpers/, where you'll see an appropriate .inc file for each content type.

To do the same for the associated views, field-level template files were created (views-view-field--blogs--title.tpl.php, views-view-field--articles--created.tpl.php, etc.) and field-level preprocessing was added to the theme.inc of each corresponding feature (op_articles, for example).

If you've implemented your own theme based on the OpenPublish one, you can easily get the benefit of the resulting preprocessed variables by creating the following TPL files and replacing what would normally be $output with $rdfa_title, $rdfa_created or $rdfa_author.
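As a hedged illustration of that substitution, the sketch below mimics the body of a field-level template. The stand-in value assigned to $rdfa_title is invented so the code runs outside Drupal; in a real theme the field-level preprocessor supplies it, and the template itself would contain little more than the print statement.

```php
<?php
// Stand-in value so the sketch runs outside Drupal. In a real theme the
// field-level preprocessor supplies $rdfa_title; this markup is illustrative only.
$rdfa_title = '<a href="/articles/first-post"><span property="dc:title">First Post</span></a>';

// A stock Views field template prints $output. The override prints the
// RDFa-enhanced variable instead and adds a heading tag around it.
$field_markup = '<h2>' . $rdfa_title . '</h2>';
print $field_markup . "\n";
```

The same pattern applies to $rdfa_created and $rdfa_author, usually without the heading tag.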

Keep in mind that you'll also want to add some proper heading tags in some of these cases, so please refer to the following files in themes/openpublish_theme/views/articles/ to see how this is done for articles:

What's Next?

There are a bunch of other RDFa attributes and Microformats that could enrich a publishing site. In this first implementation we are just touching the basics. As support for semantic markup improves and grows in OpenPublish we will keep updating this document.
