How to prevent ugly line breaks
The problem
Everybody has seen them: inept line breaks that spoil the
appearance of the whole text. Alarmingly often you see, for example, prices like 49.90
€
or physical quantities like 17.5
min. In French texts (where punctuation marks are
usually separated from the preceding word by a space !) you can sometimes find a
punctuation mark standing alone on a line of its own
!
And some time ago I even saw that Irish names like O’
Rourke can be affected by
this issue …
Missing line breaks can make a text ugly as well, especially if it is justified. Whenever a long word drops to the next line, the words before it can be pulled far apart. This rarely affects English texts, where even “significantly” counts as a long word, but you can see it quite often in languages that build compound words, e.g. in German (“Rechtsschutzversicherung”, legal protection insurance) or in Finnish (“Järjestelmävaatimukset”, system requirements).
But now let’s stop looking at horrible examples:
All these things can be avoided!
Before we get to the “real” solution, just a short paragraph about a method I do
not recommend:
It is tempting to get rid of the last of these
phenomena by simply inserting normal hyphens. This looks great at first glance, but it leads to
another common eyesore if you alter your text or layout later: you then run the risk
that in some long words a hyphen re-mains some-where in the middle of a line …
The solution
The theory is quite easy: you just have to tell the computer where a line break is allowed and where it has to be avoided. To make that possible, some characters have been invented for this very purpose:
Character | ANSI (decimal) | Unicode | HTML | Line break | Visible |
---|---|---|---|---|---|
Line break | * | * | <br> | always | no |
Space | 32 | U+0020 | whitespace | possible | as space |
Hyphen-Minus | 45 | U+002D | - | possible | yes |
No-Break Space | 160 | U+00A0 | &nbsp; | no | as space |
Soft Hyphen | 173 | U+00AD | &shy; | possible | when breaking |
Non-Breaking Hyphen | n/a | U+2011 | &#8209; | no | yes |
* The encoding of a line break depends on your operating system.
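If you want to double-check the code points of these characters, a small script can help. The following is only a sketch in Python; the dictionary `CHARS` and the helper `describe` are names made up for this illustration. It prints each code point in hexadecimal (the form used after “U+”), in decimal, and with its official Unicode name.

```python
import unicodedata

# Code points of the characters discussed here, written in hexadecimal.
# The labels are the informal names used in this article.
CHARS = {
    "Space": 0x0020,
    "Hyphen-Minus": 0x002D,
    "No-Break Space": 0x00A0,
    "Soft Hyphen": 0x00AD,
    "Non-Breaking Hyphen": 0x2011,
}


def describe(code_point: int) -> str:
    """Return 'U+XXXX (decimal N) OFFICIAL NAME' for one code point."""
    name = unicodedata.name(chr(code_point))
    return f"U+{code_point:04X} (decimal {code_point}) {name}"


for label, code_point in CHARS.items():
    print(f"{label}: {describe(code_point)}")
```

Note that the number after “U+” is always hexadecimal, while the ANSI column and numeric HTML entities such as &#8209; use decimal numbers.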
The first three characters in the table are the “normal” ones everybody knows. You can simply type them on your keyboard. The others deserve a few comments:
No-Break Space
The No-Break Space looks just like a normal space, but it doesn’t allow a line break – thus it solves the problem of the breaking units (49.90 €, 17.5 min, …) and punctuation marks (Yeah !).
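If you have a lot of text, you may not want to insert every No-Break Space by hand. Here is a minimal sketch of how this could be automated in Python. The unit list `UNITS` and the function names are assumptions chosen for this example, not a complete solution; you would have to extend the list for your own texts.

```python
import re

NBSP = "\u00A0"  # No-Break Space

# Assumed, illustrative list of units; extend it as needed.
UNITS = ("€", "min", "kg", "km")

# A digit, a normal space, one of the units, and no further word character.
_UNIT_PATTERN = re.compile(
    r"(\d) (" + "|".join(re.escape(unit) for unit in UNITS) + r")(?!\w)"
)


def bind_units(text: str) -> str:
    """Replace the space between a number and a known unit with a No-Break Space."""
    return _UNIT_PATTERN.sub(lambda m: m.group(1) + NBSP + m.group(2), text)


def bind_french_punctuation(text: str) -> str:
    """Replace the space before ! ? ; : with a No-Break Space."""
    return re.sub(r" ([!?;:])", NBSP + r"\1", text)


print(repr(bind_units("49.90 € and 17.5 min")))
print(repr(bind_french_punctuation("Yeah !")))
```

The lookahead `(?!\w)` keeps the pattern from touching ordinary words that merely start with a unit, such as “5 minutes”.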
Soft Hyphen
The Soft Hyphen is a kind of ‘syllable division mark’; it marks a position where a line break is allowed within a word. If you insert a Soft Hyphen at a reasonable position within words like “equivalent” (“equi¬valent”, where “¬” symbolizes the Soft Hyphen), the missing break opportunity is restored. As long as the word stays together on one line, the Soft Hyphen remains invisible; but if the word is divided, the desired hyphen appears automatically.
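As a small illustration, this is how Soft Hyphens could be inserted programmatically. It is only a sketch in Python: the helper names `soften` and `strip_soft_hyphens` are invented for this example, and the break positions have to be supplied by you (or by a hyphenation dictionary), since the code does not know where syllables end.

```python
SHY = "\u00AD"  # Soft Hyphen


def soften(word: str, *positions: int) -> str:
    """Insert a Soft Hyphen before each given character index of `word`."""
    parts = []
    last = 0
    for pos in sorted(positions):
        parts.append(word[last:pos])
        last = pos
    parts.append(word[last:])
    return SHY.join(parts)


def strip_soft_hyphens(text: str) -> str:
    """Remove all Soft Hyphens, e.g. before searching or comparing text."""
    return text.replace(SHY, "")


softened = soften("Rechtsschutzversicherung", 6, 12)
print(repr(softened))
print(strip_soft_hyphens(softened))
```

The second helper is there because the invisible character is still part of the string: a plain search for the word will not find the softened version unless the Soft Hyphens are stripped first.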
Non-Breaking Hyphen
Although it looks like a normal hyphen, the Non-Breaking Hyphen doesn’t allow line breaks
(just like the No-Break Space). This makes sense for abbreviations like UV‑A and
UV‑B.
However, the Non-Breaking Hyphen is not available if your program doesn’t
support Unicode; it does not exist in ANSI. Additionally, it is not included in well-known
fonts like Times New Roman, Arial, Courier New, Garamond, Tahoma or Verdana (but some programs
can handle it anyway – just try it!).
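If your program and font can handle the Non-Breaking Hyphen, replacing the normal hyphen in selected terms is straightforward. The following Python sketch assumes that you keep your own list of terms that must not be broken; the function name `protect_terms` is made up for this example.

```python
NB_HYPHEN = "\u2011"  # Non-Breaking Hyphen


def protect_terms(text: str, terms: list[str]) -> str:
    """Replace the hyphen with a Non-Breaking Hyphen, but only inside the listed terms."""
    for term in terms:
        text = text.replace(term, term.replace("-", NB_HYPHEN))
    return text


print(repr(protect_terms("UV-A and UV-B are well-known", ["UV-A", "UV-B"])))
```

Restricting the replacement to a list is deliberate: ordinary compounds like “well-known” should usually keep their normal hyphen, because a break after it is perfectly acceptable.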
The apostrophe
ANSI contains only one apostrophe ('), which is a punctuation mark and thus allows line
breaking. Unicode, on the other hand, has several, and they have different properties.
The interesting one in this context is called “Modifier Letter Apostrophe”, which is not
a punctuation mark but a letter. Like the Non-Breaking Hyphen, it is not included in
well-known fonts like Times New Roman, Arial, Courier New, Garamond, Tahoma or Verdana. If you
have a suitable font, you will find the Modifier Letter Apostrophe in the Unicode block
“Spacing Modifier Letters” at U+02BC (decimal 700), or you can use the HTML entity &#700;.
My font Quivira of course contains all the characters mentioned here.
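For the apostrophe in names like O’Rourke, a replacement could look like the following Python sketch. It rests on a simplifying assumption: only a capital “O” followed by an apostrophe and another capital letter is treated as a name prefix. The function name `protect_name_apostrophe` is invented for this example.

```python
import re

MODIFIER_APOSTROPHE = "\u02BC"  # Modifier Letter Apostrophe

# Capital O at a word start, then ' or ’ (U+2019), then a capital letter.
_NAME_PREFIX = re.compile(r"\bO['\u2019](?=[A-Z])")


def protect_name_apostrophe(text: str) -> str:
    """Replace the apostrophe in names like O'Rourke with U+02BC."""
    return _NAME_PREFIX.sub("O" + MODIFIER_APOSTROPHE, text)


print(repr(protect_name_apostrophe("O'Rourke and O\u2019Brien don't mind")))
```

Keep in mind that the result contains a different character than the one your readers type, so a text search for “O'Rourke” may no longer find the name.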
Usage
Office programs usually have a menu that allows you to insert special characters which are not
found on the keyboard (e.g. menu “Insert” → “Special character” or “Symbol”).
Generally, I recommend BabelMap as a convenient character table; but if you’re using
Microsoft Windows, you can also use the character map that comes with it.
Special issues in web projects
In theory, these characters pay off most in web projects, because you never know the
font sizes and window sizes your visitors use.
Unfortunately, support for the characters mentioned above is patchy:
- The three “normal” characters and the No-Break Space work in every browser. However, some browsers allow line breaks after a hyphen, and others don’t (according to the Unicode Standard they should allow line breaks!).
- The Soft Hyphen is supported correctly by Internet Explorer and Opera. The Gecko-based browsers (Firefox, Mozilla and Netscape) ignore it, which does no harm. Annoyingly, Konqueror always shows it as a hyphen (even within a line), so you had better not use it. Update (19.04.2007): I recently tested the Soft Hyphen in a new browser called Swift, and it worked perfectly. Swift uses Apple’s WebKit engine, so it should work in Safari, too. I also read in the German SELFHTML forum that this bug has been fixed in Konqueror, so the Soft Hyphen should now be safe to use: every modern browser either treats it correctly or at least doesn’t show an inept hyphen.
- The Non-Breaking Hyphen shouldn’t be used either since it is not included in the well-known fonts.
In other words, before using these characters you should consider your target audience and the browsers it is likely to use. If you decide to use them, BabelMap is a very useful tool for inserting them, because it provides not only the characters themselves but also the finished HTML entities.
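If your text already contains the special characters and you would rather have them visible in the HTML source, they can be converted to entities in one step. The following Python sketch uses the standard `html.escape` function for the usual markup characters and a translation table for the rest; the names `ENTITIES` and `to_html` are chosen for this example only.

```python
import html

# The special characters from this article and their HTML entities.
ENTITIES = str.maketrans({
    "\u00A0": "&nbsp;",   # No-Break Space
    "\u00AD": "&shy;",    # Soft Hyphen
    "\u2011": "&#8209;",  # Non-Breaking Hyphen
    "\u02BC": "&#700;",   # Modifier Letter Apostrophe
})


def to_html(text: str) -> str:
    """Escape &, < and >, then write the special characters as entities."""
    # Escape first, otherwise the '&' of the new entities would be escaped too.
    return html.escape(text, quote=False).translate(ENTITIES)


print(to_html("49.90\u00A0€ < 60\u00A0€"))
print(to_html("UV\u2011A"))
```

This is optional: in a page that is correctly declared as UTF-8, the raw characters work just as well as the entities. The entities are merely easier to spot when editing the source.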