Michael Eriksson
A Swede in Germany

Correct use of dashes

Introduction

There are a number of dashes available in “proper” typography/type-setting (cf. e.g. “Dash” on Wikipedia), the most notable being the em-dash (—), the en-dash (–), and the vanilla hyphen (-).


Side-note:

The last is a disputable claim, but considering the general knowledge of type-setting and the limitations and ambiguities of keyboard input in various contexts, it is a reasonable approximation: Typically, the hyphen does service as both a minus sign and a figure dash—and very often replaces both the en- and the em-dash, too. For the remainder of this article, I will ignore other variations than these three.


This article sets out to give a simplified overview of the main uses. (Giving a complete overview would take too much space, introduce points of contention, and require knowledge that I currently do not have.)

When to use what

Separating parts of sentences

A dash used to separate parts of a sentence (e.g. “I saw you—even though you hid.”, or as a parenthesis) should always be written with an em-dash, with no surrounding spaces. The form “I saw you - even though you hid.” leads to an unfortunate use of the “-” symbol in several roles, which reduces the already too weak support for unambiguous thought in the current system of writing. Similar arguments apply to use of the en-dash, be it with or without spaces, and to the rare em-dash with spaces.

The common German recommendation of avoiding “—” altogether in favour of “ - ” should not be followed (whether in English or German) as it lacks a strong rationale and goes counter to the above.

Indicating relationships

To indicate relationships, ranges, and similar, the en-dash is the correct choice: “Dover–Calais”, “the Tyson–Bolt match-up”, “Lunch hour: 12.00–13.00”.
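
As an illustration of the range case, the following Python sketch replaces a hyphen standing directly between two digits with an en-dash. The function name `hyphen_ranges_to_en_dash` and the digits-only heuristic are assumptions made for this example, not an established rule: the pattern will also match dates, phone numbers, and subtractions, so the result still needs a human check.

```python
import re

EN_DASH = "\u2013"  # Unicode EN DASH

def hyphen_ranges_to_en_dash(text: str) -> str:
    """Hypothetical helper: turn digit-hyphen-digit into digit-en-dash-digit."""
    # The lookbehind/lookahead require a digit on each side, so hyphens
    # inside words ("match-up") are left alone.
    return re.sub(r"(?<=\d)-(?=\d)", EN_DASH, text)

if __name__ == "__main__":
    print(hyphen_ranges_to_en_dash("Lunch hour: 12.00-13.00"))
```

Relationships between words (“Dover–Calais”) cannot be detected this mechanically, because a program cannot tell them apart from ordinary hyphenated compounds.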

Joining words

To join two words for greater clarity, e.g. “solid-state hard-drive” (over “solid state hard drive”), a hyphen should be used. The same applies to words that are slowly merging into a fixed expression, e.g. “web site”, “web-site”, “website”, but not “web–site”.

When forming a fixed compound, the en-dash is often logically correct (cf. the previous section); however, there can be many border-line cases. Consider e.g. “African?American”: If “African” is seen as an adjective, a hyphen is preferable; however, if a noun, the en-dash is better. This also affects the joining when used as an overall noun: “He is an African American.” is correct for an adjective “African”; but “He is an African–American.” when “African” is a noun. The corresponding (overall) adjectives always have a dash: “He is an African-American man.” (adjective) and “He is an African–American man.” (noun).

Everything else

For most other instances, the hyphen is usually correct, including, obviously, dividing a word at the end of a line and setting off a prefix (“co-XXX”).

Disclaimer: I may well have over-looked a critical case, so take this statement with a grain of salt.

When not to use any kind of dash

Almost always: There is no justification to write e.g. “The—re is–no ju-st-ification [...]”.

Common sense aside: Modern English users (in particular USanians) tend to interpunctuate too little, be it with regard to hyphenation or in general. The result is that the language becomes weaker, less expressive, and more ambiguous. Considering today’s habits, my recommendation is to err on the side of too much, rather than too little: The odd extra hyphen is unlikely to cause confusion, but its absence can cause problems. (Note, however, that with a higher base-level this statement would apply in reverse. I, e.g., may have too high a base-level, where most others have one that is much too low.)

Correspondingly, additional rules for when hyphenation should be avoided are redundant.


Side-note:

Note that many “modern” sources suggest rules that remove most interpunctuation—not only the optional, but often the clearly needed. Because they lack a logical justification, are rooted in an attempt to make life easier for the writer (not the reader), and tend to increase the deterioration, I strongly distance myself from them.


Minus signs

Outside the scope of my three-dash fiction, the minus sign is an important case. Typically, using a hyphen is quite acceptable; however, it is worth mentioning that e.g. LaTeX comes with a separate minus sign (even a differentiation into a negation operator and a subtraction operator).

Emulating and inputting dashes

In situations where the “—” sign is not available, use two or three consecutive “-”s as a workaround—I prefer three. Similarly, a “–” can be emulated by one or two “-”s—I prefer two. As stated above, surrounding spaces should be avoided. The rationale for my preference is that this makes the emulations and the vanilla hyphen clearly separate. Further, I am influenced by how dashes are input in LaTeX—a tool with which I have extensive experience, and which is vastly superior to e.g. MS Word.


Side-note:

TeX/LaTeX provides a good example of how to handle dashes of varying lengths, with not only “—”, “–”, and “-” (coded with three, two, and one hyphens, respectively, and rendered roughly as displayed here), but also a separate minus sign (coded with a hyphen, which is correctly rendered based on the mathematical context). With, so far, the exception of the minus sign, I emulate this behaviour in my own mark-up language, which is used to write these pages.
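
The three/two/one convention can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how TeX or the mark-up language mentioned above actually work: the function name `expand_dashes` is made up for the example, and runs of four or more hyphens are left untouched simply because the text gives no rule for them.

```python
import re

EM_DASH = "\u2014"  # Unicode EM DASH
EN_DASH = "\u2013"  # Unicode EN DASH

def expand_dashes(text: str) -> str:
    """Hypothetical helper: map '---' to an em-dash and '--' to an en-dash."""
    def replace_run(match: re.Match) -> str:
        run = match.group(0)
        if len(run) == 3:
            return EM_DASH
        if len(run) == 2:
            return EN_DASH
        # Single hyphens stay hyphens; longer runs are passed through
        # unchanged (an assumption, e.g. for ASCII rulers like "-----").
        return run

    # Match each maximal run of hyphens so that '---' is never read
    # as '--' followed by '-'.
    return re.sub(r"-+", replace_run, text)

if __name__ == "__main__":
    print(expand_dashes("I saw you---even though you hid."))
    print(expand_dashes("Dover--Calais by solid-state ferry"))
```

Matching whole runs, rather than replacing “---” first and “--” second, keeps the three cases clearly separate, which is the same rationale given above for preferring three and two hyphens.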


A special case is HTML: At least in the modern versions, there is an enormous number of characters available—including counterparts to the dashes discussed here. Correspondingly, the vanilla hyphen should not be used to emulate em- and en-dashes. Instead, the corresponding numerical entities (&#8212; and &#8211;) can be used. This will work equally well with XML.
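
A minimal Python sketch of this substitution, assuming the source text already contains the real Unicode characters and the output has to be plain ASCII HTML or XML. The decimal references used are &#38;#8212; for the em-dash (U+2014) and &#38;#8211; for the en-dash (U+2013); the function name `dashes_to_entities` is hypothetical.

```python
import html

# Map code points to their decimal numeric character references.
DASH_ENTITIES = {
    0x2014: "&#8212;",  # em-dash
    0x2013: "&#8211;",  # en-dash
}

def dashes_to_entities(text: str) -> str:
    """Hypothetical helper: replace em-/en-dashes by numeric references."""
    # str.translate accepts a dict from code point to replacement string.
    return text.translate(DASH_ENTITIES)

if __name__ == "__main__":
    encoded = dashes_to_entities("Dover\u2013Calais")
    print(encoded)
    # html.unescape from the standard library reverses the step.
    print(html.unescape(encoded) == "Dover\u2013Calais")
```

Numeric references are used here rather than named entities because they are valid in both HTML and XML without any further declarations.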

Another special case is when a file encoding is used where the characters exist (but are not necessarily available through individual keys on the keyboard): Here it may pay to find out how to enter the corresponding characters by e.g. multi-key combinations (unfortunately, these will vary from program/system to program/system). Notably, the various Unicode-encodings contain both em- and en-dashes, and many others besides.
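
Whether a given encoding contains the dashes can be checked directly. The Python sketch below uses only the standard library; the helper names `describe` and `can_encode` are made up for this example, and the list of characters and encodings is merely a sample.

```python
import unicodedata

DASHES = ["\u002d", "\u2012", "\u2013", "\u2014", "\u2212"]

def describe(ch: str) -> str:
    """Hypothetical helper: code point plus official Unicode name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def can_encode(ch: str, encoding: str) -> bool:
    """Hypothetical helper: True if the encoding has a slot for ch."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

if __name__ == "__main__":
    for ch in DASHES:
        support = {enc: can_encode(ch, enc)
                   for enc in ("ascii", "latin-1", "cp1252", "utf-8")}
        print(describe(ch), support)
```

Under ASCII and Latin-1, only the plain hyphen survives, which is exactly the situation where the multi-hyphen emulation remains necessary.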