Michael Eriksson
A Swede in Germany

Technical notes

Page generation

At the moment

At the time of writing, 2010-04-26, the contents of this website are generated in a two-step process from a markup that comes close to plain text. In the first step, XML files are generated from the original markup, extended with various kinds of meta-information gathered from other sources. These files are still “layout independent”. In the second step, XSLT is used to transform the XML files into the XHTML eventually presented to visitors, including generation of headers, footers, navigation, etc. (The exact layout and positioning are mostly determined by CSS at the time of viewing.) Some other pages and documents are generated in parallel (partially using the same, partially other mechanisms), notably including the various human- and machine-readable versions of the sitemap.
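As a rough illustration of the two-step idea (not the actual implementation), the following Python sketch first turns a near-plain-text source into layout-independent XML and then derives XHTML from it. The markup rules (blank-line-separated blocks, "= " for headings), the element names page, heading, and para, and both function names are assumptions made purely for this example. The second step is plain Python standing in for the XSLT transformation, which keeps the sketch runnable with the standard library alone.

```python
import re
import xml.etree.ElementTree as ET


def markup_to_xml(text, meta):
    """Step 1: near-plain-text markup -> layout-independent XML.

    Hypothetical markup: blocks are separated by blank lines; a block
    starting with "= " is a heading, everything else is a paragraph.
    """
    page = ET.Element("page", meta)  # meta-information stored as attributes
    for block in re.split(r"\n\s*\n", text.strip()):
        block = " ".join(block.split())  # join wrapped lines, collapse spaces
        if not block:
            continue
        if block.startswith("= "):
            ET.SubElement(page, "heading").text = block[2:]
        else:
            ET.SubElement(page, "para").text = block
    return page


def xml_to_xhtml(page):
    """Step 2: XML -> XHTML with generated header and footer.

    Stand-in for an XSLT stylesheet; layout is left to CSS classes.
    """
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = page.get("title", "")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "div", {"class": "header"}).text = page.get("title", "")
    tag_map = {"heading": "h2", "para": "p"}  # logical element -> XHTML element
    for node in page:
        ET.SubElement(body, tag_map[node.tag]).text = node.text
    footer = ET.SubElement(body, "div", {"class": "footer"})
    footer.text = "Generated " + page.get("date", "")
    return ET.tostring(html, encoding="unicode")


source = "= Technical notes\n\nGenerated in\ntwo steps."
page = markup_to_xml(source, {"title": "Technical notes", "date": "2010-04-26"})
print(xml_to_xhtml(page))
```

The point of the intermediate XML is that the first step knows nothing about headers, footers, or navigation; those are added only when the XHTML is produced, so the same XML could feed other outputs as well.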

Background and earlier versions

Originally, I planned to find an existing CMS that allowed me to manipulate individual text files on the file-system level (as opposed to forcing me to edit via a browser). Just to get started, I threw together a little shell script with a few sed statements to generate very basic HTML from some preliminary files I already had. After a few weeks, when I still had not gotten around to looking for a CMS, I came to the conclusion that this script went a lot further in covering my needs than I had anticipated; in particular, since I had already made several easy changes and enhancements.

I discarded the CMS plans and continued to develop the script as needed, buying greater flexibility and ease of use (not to mention more fun and a greater satisfaction) at the price of a greater effort. Other benefits include that I am able to serve all pages statically (and therefore much faster than with a conventional CMS), and that I do not have to make major adaptations for a CMS-specific markup: The template language that I use comes very close to how I write my texts anyway (e.g. quotation marks, emphasis, dashes). Headings, tables, and similar require additional markup; however, even this is uncomplicated and largely a matter of preference: it would be possible to eliminate it almost entirely, were I willing to sacrifice the logical, LaTeX-like structure.

As time went by and more features were added, this solution became too complex and hard to maintain, largely because the implementation had next to no context awareness and used raw text streams instead of structured data. After considering a few alternatives (including using/writing a parser), I settled on a comparatively simple translation into XML, with the brunt of the work being done using XSLT on the (automatically structured and context-aware) XML files.
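The difference between a raw text stream and structured data can be illustrated with a small Python sketch. The emphasis rule (*word*), the element names para and code, and both function names are hypothetical and chosen only for this example; the original script and its rules are not reproduced here. A context-free substitution, in the spirit of a sed rule, rewrites every match, while a pass over structured data can decide per element whether the rule applies.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical emphasis rule: *text* becomes <em>text</em>.
EMPHASIS = re.compile(r"\*(\w[^*]*)\*")


def stream_emphasis(text):
    """Apply the rule to raw text, with no knowledge of the context."""
    return EMPHASIS.sub(r"<em>\1</em>", text)


def structured_emphasis(page):
    """Apply the rule per element, leaving everything but <para> untouched."""
    result = []
    for node in page:
        text = node.text or ""
        if node.tag == "para":  # only running text gets emphasis
            text = EMPHASIS.sub(r"<em>\1</em>", text)
        result.append(text)
    return result


# The stream version mangles a formula that merely contains asterisks.
print(stream_emphasis("x = a*b*c"))

# The structured version knows that the formula sits inside <code>.
page = ET.fromstring(
    "<page><para>This is *important*.</para><code>x = a*b*c</code></page>"
)
print(structured_emphasis(page))
```

With only a few rules, such collisions can be patched case by case; with many interacting rules, each patch tends to need knowledge of the surrounding context, which is what the structured representation supplies.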

Other technical aspects

Apart from this, I have only had to add CSS, as the web server is run automatically by my ISP, freeing me from the need to worry about security updates, appropriate configuration, etc. (I know from my professional experience that this can be a hassle; in particular, when a non-trivial version change in the server is needed.)

Keywords

For a long time, I tried to always provide keywords in the <meta name="keywords" sense (mostly interesting to search engines). While this is a sound idea, it proved to be psychologically problematic: The extra effort of thinking through the contents of the text and generating keywords often made me postpone the writing or publishing of an article. Correspondingly, most newer pages will be left without keywords for the time being.

Other contributing factors were the probably low use of the keywords among search engines (and the definitely low use among human users) and the problem of keeping keywords reasonably consistent.

Possibly, I will make more “global” efforts at some time(s) in the future, providing consistent keywords for a greater number of already existing articles.

Further information

Further information on the technical topics is available in my discussions of manual checks and automatic validation.

A separate page deals with the history of this website.