Michael Eriksson
A Swede in Germany

The abysmal design of online newspapers

Introduction

I am continually amazed by the idiotic designs that many online newspapers use: Overloaded, chaotic, borderline unreadable, ... I admit that they face complications that I do not, say, generating enough advertising income to justify an online edition in addition to (or instead of) a paper edition. Nevertheless, the way they go about it is idiotic.

A disastrous example

Consider e.g. an article on the 2009 Swedish success in the French Open:

  1. The article is 25 lines, 234 words, and 1428 characters long (not counting headings and legends), against an overall webpage size of 3114 lines, 12553 words, and 203133 characters! That is: More than 120 times the number of lines, more than 50 times as many words, and more than 140 times the number of characters!

    (Numbers as found at the time of original writing and calculated with some approximation by use of the Unix tool wc. Note that a comparison of lines and words is only semi-relevant when factoring in mark-up—the characters are the most telling.)

    In contrast, the same character counts for my start page (on 2009-06-02) are 5882 vs. 10203, i.e. less than 3/4 of additional overhead. This overhead includes several navigational structures and, obviously, the actual HTML. Note how my “content count” is more than four times that of the external article, but my “overall count” is some twenty (!) times less. A truly minimal, HTML-compatible version of either page would have been able to get away with considerably less overhead: A minimal, contentless, strict XHTML page (what I use) gives 263 characters; a non-validating “ur-HTML” which most (all?) browsers will handle correctly needs 27 characters. Cf. below.

    Notably, this is the pure article page, alone. Add in style-sheets, images, Flash, whatnot, and the true size of the external article explodes: Downloading it and checking the local size yields almost 1900 KB, or an increase by more than a factor of nine.


    Addendum:

    Looking back from a 2023 perspective, these numbers seem far less remarkable than back then. Yes, the page was horrifyingly incompetently made, and I would make the same claim about such a page published even today. However, even worse pages are quite common today. There are some sites with pages that exceed that overall 1900 KB (< 2 MB) just with the page itself, without including images and whatnot. And, no, these are not pages with 2 MB of actual text; they are pages with massive and utterly disproportionate CSS, JavaScript, HTML-tag, whatnot overhead within the actual page. (As a comparison, 2 MB of readable text is approximately half a Bible and, at the time of writing, comparable to this entire website.)


  2. Running it through an external validator (http://validator.w3.org/) yields “16 Errors, 5 warning(s)”. Using my local tools that I apply to my own work (tidy and xmllint), I have some fifty errors and warnings for the same page...


    Addendum:

    Embarrassingly, the original version of this page was faulty too. I had so far relied on the local tools, which, apparently, are not fool-proof. Using the online validator above, I found two instances of an automatically generated <ul> </ul> without any content, which is not allowed. A further dozen or so sample checks of other pages found that I had built the page-internal links and references incorrectly (“+” is apparently not allowed in the name attribute of an a tag).

    In my defense, my website is a few months old at the time of writing. Aftonbladet’s has been around since 1994 (according to http://koncernen.aftonbladet.se/tidningen/tidningen_historik/article3674.ab), and it has much larger resources than I do.


  3. Looking at the core content, we can see that about a third is taken up by an image of a smiling man and the tennis court, which brings absolutely nothing in content. (Incidentally, the man is sufficiently well known in Sweden that a picture seems superfluous already for that reason.) A further (rough guesstimate) sixth of the remainder is taken up by an attempt to make the reader write about the article in his blog. (No thank you! If I do, it will just be to ridicule it—as this page testifies.)

  4. At the left of the article is a navigational structure that stretches 5–6 pages down after the true content has ended. The true content, in turn, covers roughly one page... (“Page”, here, is to be understood as the area visible on the screen at a particular time. This will vary depending on screen size, height of browser window, and other circumstances. The proportions are still telling.)

  5. The same applies, mutatis mutandis, to the right, where a set of links and images with no direct relevance to the article can be found, reaching the same depth as the navigation to the left. Roughly half the links deal with soccer (!); none with tennis.

  6. The page contains at least three animated GIFs (or other moving images) in intrusive adverts, and no fewer than seven (7!) Flash animations. One of these is in a fixed position on the screen, irrespective of how far the window has been scrolled. (Presumably using CSS position: fixed, which is almost always a bad design decision.) This animation is an intrusive advert that makes the page hard to read. The others are similarly intrusive, and only one has any connection with the article; the rest are advertising. Notably, by the time I had them all activated, I had so many distractions on the screen that I, literally, would not have been able to read the article. Even just looking at the page made me so distraught (not merely distracted) that I could barely stand the situation: I felt a sense of physical relief when I closed the window. (Normally, I surf with Flash and JavaScript off, and images loaded on an on-demand basis.)

  7. The browser that I used, having this one page open, was constantly at about 95 % (!) of my CPU’s limit. (A 2.6 GHz Celeron: Not, I admit, a state of the art CPU; however, we are talking about watching one single webpage! Seeing that I regularly, and with the load hardly even registering, can have dozens of text-only webpages open at the same time...)


    Addendum:

    Here it would have been helpful to mention the browser in detail, as the fault might have been partially with the browser. (In the sense that another browser might have made something better of a bad situation.) By now, 2023, I cannot supply this information with certainty, let alone detail, but it was likely a then-current version of Opera running under a then-current version of Debian, if I used my main browser. However, it is conceivable that I switched to another browser (likely Firefox, if so), in order to not have to alter my default settings too much.


  8. Loading the page (with images, but without Flash) took me 28 (!) seconds in a trial with a freshly emptied browser cache. This amounts to roughly 8 (!) words/second if we look at the actual readable and relevant content of the article; alternatively, roughly 50 characters/second. Modems capable of higher speeds were available before I was born (in 1975)! In effect, a user reading an article of similar length on an early plain-text, electronic bulletin board would have finished his download after about the same time...
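The ratios and speeds quoted in the list above can be cross-checked with a few lines of Python. This is only a sketch: the numbers are copied from items 1 and 8, while the variable and function names are invented for illustration.

```python
# Figures quoted in items 1 and 8 above (article vs. whole page, as counted with wc).
article = {"lines": 25, "words": 234, "chars": 1428}
page = {"lines": 3114, "words": 12553, "chars": 203133}
load_seconds = 28  # load time quoted in item 8


def ratios(content, total):
    """Return total/content for every unit of measurement."""
    return {unit: total[unit] / content[unit] for unit in content}


def per_second(amount, seconds):
    """Readable content effectively delivered per second of loading."""
    return amount / seconds


page_ratios = ratios(article, page)
words_per_second = per_second(article["words"], load_seconds)
chars_per_second = per_second(article["chars"], load_seconds)

# Start page quoted in item 1: 5882 characters of content, 10203 overall.
start_page_overhead = (10203 - 5882) / 5882

for unit, ratio in page_ratios.items():
    print(f"{unit}: page is {ratio:.1f} times the article")
print(f"{words_per_second:.1f} words/s, {chars_per_second:.1f} characters/s")
print(f"start page overhead: {start_page_overhead:.2f} of the content size")
```

As noted in item 1, the character ratio is the most telling of the three, because line and word counts are distorted by mark-up.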

To put it plainly: If I were given the choice between having to watch a page like this one with all images, JavaScript functionality, and Flash-snippets activated, or to not watch it at all, I would unhesitatingly choose not to watch it at all. Even with my normal bare-bone, text-only settings, pages like this are truly on the border of what I will tolerate even from a free-of-charge website. The contents are rarely worth the visit. “Geschenkt ist noch zu teuer.”, to use a popular German saying: Too expensive, even if a gift.

Excursion on overhead needed

In the original version of this page, I gave a hundred characters as a rough guesstimate of the minimal overhead needed. Later, I decided to investigate this in more detail.

First, I cut out all content and dead-weight (including unneeded white-space) from my start page. This is the result (263 characters):

<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head><title></title></head><body></body></html>

Then I went down to the barest minimum likely to be understood by a browser (27 characters):

<html><body></body></html>

Note that the latter example is not something that I would recommend as a basis, but it actually gives fewer validation errors and (filled with the corresponding text without additional markup) is likelier to be correctly displayed in a typical browser than the article discussed above. The major stumbling block is character encoding, but someone using only ASCII in a Western context should be OK, because these characters are mostly identical in the various Western encodings.
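The point about ASCII can be illustrated with a small Python check. It is a sketch under assumptions: the selection of encodings is just a handful of common Western ones, and the sample strings are invented.

```python
# Text restricted to ASCII, as assumed in the paragraph above.
text = "<html><body>Plain ASCII text only.</body></html>"

# A few common Western encodings (the selection is an assumption for illustration).
encodings = ["ascii", "latin-1", "cp1252", "iso8859_15", "utf-8"]

# Encode the same text with every encoding and compare the resulting bytes.
encoded = {name: text.encode(name) for name in encodings}
all_identical = len(set(encoded.values())) == 1

# A non-ASCII character, by contrast, is encoded differently.
non_ascii = "å"
differs = non_ascii.encode("latin-1") != non_ascii.encode("utf-8")

print("ASCII text identical in all encodings:", all_identical)
print("Non-ASCII text differs between latin-1 and utf-8:", differs)
```

For pure ASCII, a browser that guesses any of these encodings sees the same bytes; as soon as a character like “å” appears, a wrong guess produces garbage.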

In both cases, a minimal, bare text, page with no additional overhead could be built by simply putting the text in the body tag. The result would not be something to write home about, but, again, better than the discussed article.
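A minimal sketch of such a bare-text page, using the shorter skeleton, might look as follows. The function names and the sample sentence are hypothetical; escaping with html.escape is an addition that the text above does not mention, but it keeps characters like “<” from being mistaken for mark-up.

```python
from html import escape

# The bare skeleton from above, split around the place where the text goes.
SKELETON_OPEN = "<html><body>"
SKELETON_CLOSE = "</body></html>"


def minimal_page(text):
    """Wrap plain text in the bare skeleton, escaping &, < and >."""
    return SKELETON_OPEN + escape(text, quote=False) + SKELETON_CLOSE


def overhead(text):
    """Number of characters in the page beyond the text itself."""
    return len(minimal_page(text)) - len(text)


sample = "Sweden did well in Paris."  # hypothetical placeholder text
print(minimal_page(sample))
print("overhead in characters:", overhead(sample))
```

For text without special characters, the overhead is the 26 characters of the skeleton; a count made with wc on a file would presumably come out one higher because of a trailing newline.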