A wunderBAR Story

Or, what should be broken became solid, and what should be solid became broken.

While searching for something completely different, I came across a fascinating story of the “wunderBAR”. Since the story is very short, I’ll quote it here in full:

It seems that in the sixties, the intent was to have a vertical bar. But the keyboard standard was printed with a dirt on the camera-ready proof so that it was mass-reproduced as a broken bar. Then most manufacturer implementors did not read the book (…) and copied the drawings from the picture of the keyboard. And so we have this character for which nobody has a use, but many code tables contain it and compilers use its code point!

That’s a really interesting story! But… it doesn’t quite ring true. Then again, there could well be something to it, because no one quite seems to know what the broken bar is for.

In fact, the broken bar barely even exists anymore. In the days of DOS, the pipe symbol (on the DOS command line) and the logical OR operator (in C/C++, for example) were ASCII code 7Ch (124 decimal), which was rendered as a broken vertical bar by the fonts of at least the IBM MDA, CGA, EGA, and VGA cards. But nowadays that is no longer the case. The same ASCII codepoint is rendered as a solid vertical bar in Windows 10 or Linux, and also shown as a solid vertical bar on contemporary keyboards. What happened?

The first problem with the wunderBAR story is that it talks about a “keyboard standard”. That sounds rather unlikely. But the ASCII standard was developed in the 1960s, and it did go through several revisions. The story could plausibly be about ASCII rather than some kind of keyboard standard.

Untangling the more recent history is easier. For example the ANSI X3.4-1986 standard very clearly describes the character 7/12 (column/row 7/12, or ASCII code 7Ch) as a “vertical line”, and shows it as an unbroken line or bar. There is no change in the recent reaffirmed editions of X3.4-1986. It is therefore clear that a character with ASCII code 7Ch should be displayed as an unbroken vertical line in accordance with current standards. But that does not answer the question why it ever was shown differently.
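The column/row notation used by the standards maps onto the familiar hexadecimal codes very simply: the column is the high nibble and the row is the low nibble. A minimal Python sketch of that arithmetic follows. The helper name position_to_code is made up for illustration; positions 2/1 and 5/14 are included because they come up later in the story.

```python
def position_to_code(column: int, row: int) -> int:
    """Convert an ASCII column/row position such as 7/12 to its code."""
    if not (0 <= column <= 7 and 0 <= row <= 15):
        raise ValueError("7-bit ASCII has columns 0-7 and rows 0-15")
    # The column is the high nibble, the row is the low nibble.
    return column * 16 + row


for column, row in [(7, 12), (2, 1), (5, 14)]:
    code = position_to_code(column, row)
    print(f"{column}/{row} -> {code:02X}h ({code} decimal) {chr(code)!r}")
```

Position 7/12 comes out as 7Ch (124 decimal), 2/1 as 21h (the exclamation point), and 5/14 as 5Eh (the caret).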

The ancient X3.4-1963 standard, which was primordial ASCII, didn’t even have a vertical bar character. The ACK control symbol occupied position 7/12 (code 7Ch).

Going back in history in the other direction, the ECMA-6 standard from August 1973 (4th edition) again clearly shows a plain “vertical line” at position 7/12 of the International Reference Version (meant to be aligned with ASCII). There is no sign of a broken vertical bar.

So where did it come from? Why was IBM using a broken bar in the 1980s when the standards of the time only showed an unbroken vertical line? Was it some IBM weirdness? As it turns out, not really…

The 1977 edition of ANSI X3.4 is publicly available thanks to Uncle Sam. Once again, the character in position 7/12 is an unbroken vertical line. But there is a curious note about changes in the 1977 revision:

D2.2 (3) Clarification of conflict between graphic shape and description (position 7/12).

ANSI X3.4-1977, published as FIPS 1-2

Hmm, what’s that about?

X3.4-1968

While I haven’t been able to find the X3.4-1968 standard online, a rather substantial chunk of it is publicly available thanks to RFC 20. And sure enough, at position 7/12, there is that broken bar! IBM didn’t make it up, but happened to adopt a standard that was current circa 1970, already superseded in many countries in the mid-1970s, and officially revised by ANSI in 1977. Inertia kept the broken bar going long after it was no longer the correct graphical representation of ASCII code 7Ch.

Does X3.4-1968 offer any clues then? The character at position 7/12 is described as “vertical line”, just like it is in all the later revisions. The only difference is that in X3.4-1968, the vertical line is broken, and in X3.4-1977 and later it’s not.

Since the broken bar is shown on two different pages of X3.4-1968, the story about dirt on the proof is not credible. The character was clearly meant to look like a broken bar. At least in the incomplete version of X3.4-1968 preserved in RFC 20, there is no explanation of why the bar is broken. But section 6.4 contains the following text:

Furthermore, this standard does not specify a type style for the printing or display of the various graphic characters. In specific applications, it may be desirable to employ distinctive styling of individual graphics to facilitate their use for specific purposes as, for example, to stylize the graphics in code positions 2/1 and 5/14 into those frequently associated with logical OR (|) and logical NOT (¬), respectively.

ANSI X3.4-1968, reprinted in RFC 20

It should be pointed out that the character used in the text for logical OR is clearly an unbroken vertical line. Code position 2/1 is normally the exclamation point (!). There may or may not have been more in the appendix of X3.4-1968.

X3.4-1967

But wait… X3.4-1968 was a minor revision of X3.4-1967. Perhaps the older revision might clarify the mystery?

Unfortunately, X3.4-1967 appears to be even harder to find than X3.4-1968. The closest thing I was able to locate is a November 1967 article in the Western Union Technical Review, archived thanks to Tom Jennings.

Once again, the article very clearly shows the ASCII character at position 7/12 as a broken vertical bar. Further evidence that it wasn’t any kind of misprint.

The article mentions an unpublished X3.4-1965 standard and describes the changes between the published X3.4-1963 and X3.4-1967 standards (the frequent revisions are a clear sign that things were very much in flux back then). One of the differences was the character at position 7/12; once again, it is described as a “vertical line” but shown as a broken vertical line.

Indeed the article further says:

The broken vertical line in position 7/12 will probably be widely used as either the “logical or” symbol, or to indicate “the absolute value of”.

revised U.S.A. standard code for information interchange by Fred W. Smith, Nov 1967

Now there can no longer be any question—the line was meant to be broken. Since Fred W. Smith was a member of the X3.2 subcommittee, the information should be rather authoritative.

Okay, so the 1967 and 1968 revisions of the X3.4 ASCII standard clearly did show a broken vertical line as ASCII character 7Ch, and it was unquestionably intentional. But why? It is possible that the appendices to X3.4-1967 and/or X3.4-1968 hold the answer… if I could find them.

Logical OR, Logical NOT

What I could find is a 1980 book titled Coded Character Sets, History and Development by Charles E. Mackenzie. The author was an IBMer and the book is fairly IBM-centric, but that’s not necessarily a bad thing. An entire chapter of the book, namely Chapter 24, called Logical OR, Logical NOT, is devoted to the question of which characters could be used to represent the logical OR and logical NOT operators in the ASCII character set.

On page 436, the book shows that the unpublished 1965 ASCII standard already had the vertical line in position 7/12 (unlike the 1963 version).

When the 1965 revision of ASCII was being worked on, various IBM user groups (e.g. SHARE and GUIDE) became concerned that sufficient provisions were not being made to accommodate the PL/I logical OR (|) and logical NOT operators (¬). But wait, you say! Didn’t we just establish that the vertical line (|) was already at position 7/12?

Well, yes and no. The trouble was that the vertical line was there in ASCII, but not everyone necessarily used ASCII. The ISO 646 standard reserved several characters, which included the 7/12 position, for national language support.

The PL/I fans requested that the logical OR and logical NOT characters be placed somewhere in the ASCII 20h-5Fh range (columns 2 through 5), because it was expected that many devices, including printers, would not be able to handle characters outside of that range.

It was also expected that PL/I would eventually become an international standard (which it did), and then it would be very inconvenient if the programming language required characters not available in many countries (this goes to show that predicting the future is a tricky business).

The proponents of PL/I even became members of the X3.4 subcommittee and voted against the 1965 revision of the ASCII standard, but were outvoted. However, due to upcoming changes to ISO 646, X3.4-1965 was never published, and instead X3.4 decided to wait and resolve differences with ISO 646 in the 1967 revision of ASCII.

The X3.4-1967 standard, just like X3.4-1968 quoted above, stated that manufacturers are allowed “to stylize the graphics in code positions 2/1 and 5/14 into those frequently associated with logical OR (|) and logical NOT (¬)”. This was meant to solve the problem with several regions, notably Germany and the Scandinavian countries, where the vertical bar (|) was in fact not available in the standardized 7-bit character set.

But this led to another potential problem. If a manufacturer decided to stylize the character in position 2/1 as a vertical line, and used the American or International Reference Version of the character set with a vertical line in position 7/12, there would be two visually indistinguishable characters. That was, unsurprisingly, considered extremely undesirable… and the solution was, as you can probably guess, to make the vertical line at position 7/12 look different by putting a break in the middle.
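The clash can be made concrete with a small Python sketch. Everything in it is an illustration rather than something taken from the standards: the function names are invented, and the national characters shown at position 7/12 (ö for German and Swedish, ø for Danish/Norwegian) are my assumption about the usual 7-bit national sets, not something quoted in the sources above.

```python
# Glyphs shown at positions 2/1 (21h) and 7/12 (7Ch) in a few 7-bit sets.
# The national entries are assumptions added for illustration only.
VARIANTS = {
    "US ASCII": {0x21: "!", 0x7C: "|"},
    "German": {0x21: "!", 0x7C: "ö"},
    "Swedish": {0x21: "!", 0x7C: "ö"},
    "Danish/Norwegian": {0x21: "!", 0x7C: "ø"},
}


def stylize_exclamation(table):
    """Draw 2/1 as a solid bar, as the standard allowed for logical OR."""
    styled = dict(table)
    styled[0x21] = "|"
    return styled


def break_vertical_line(table):
    """Draw a vertical line at 7/12 with a gap, as X3.4-1967/1968 did."""
    broken = dict(table)
    if broken[0x7C] == "|":
        broken[0x7C] = "¦"
    return broken


def is_ambiguous(table):
    """True when 2/1 and 7/12 would look identical on paper or screen."""
    return table[0x21] == table[0x7C]


for name, table in VARIANTS.items():
    styled = stylize_exclamation(table)
    fixed = break_vertical_line(styled)
    print(f"{name:17} clash: {is_ambiguous(styled)!s:5} "
          f"after break: {is_ambiguous(fixed)}")
```

Only the set that keeps a vertical line at 7/12 ends up with two identical glyphs once 2/1 is stylized, and breaking the line at 7/12 removes the clash. The national sets never had the problem in the first place, which is exactly why they needed the stylized exclamation point.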

Further detail can be found here. A 1972 article in the Honeywell Computer Journal chronicles the X3.4 and ISO voting and describes the decision to break the vertical line as one made “in desperation”.

This turned out to be a solution to a made-up problem. In the old days, scientific (think FORTRAN programmers) and data processing (think banks) customers actually used different character sets. This was obviously highly undesirable, and users demanded universal character sets. Anyone using a computer for text processing needed the exclamation point at position 2/1 and wasn’t going to stand for having it replaced with a vertical line. PL/I learned to use the vertical line (|), broken or not, for logical OR, and the caret (^) for logical NOT on ASCII systems.

The idea of replacing the exclamation point with a vertical line fell by the wayside by the mid-1970s. There was no longer any rationale for breaking the vertical line in position 7/12 to make it visually distinct. Hence X3.4-1977 shows the vertical line unbroken, as it was originally intended.

The story of the wunderBAR is very interesting, but decidedly made up. There was no dirt on a proof. The vertical line was broken intentionally, but for reasons that were arguably obscure even in the 1960s, and became irrelevant in the 1970s. Yet thanks to the legacy of the IBM PC, the broken bar survived for a surprisingly long time, and indeed became a distinct Unicode character, even though it has no real use.
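For the curious, the two characters are easy to tell apart programmatically today. The short Python sketch below uses only the standard library to print the Unicode names of both, and to show that byte 7Ch decodes to the plain vertical line even in the IBM PC code page; how the glyph is drawn is purely a matter of the font.

```python
import unicodedata

# The two bars are separate Unicode characters with their own names.
for ch in ("|", "¦"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Byte 7Ch is the vertical line in ASCII and in IBM PC code page 437 alike;
# the broken look on old PCs came from the font, not from the encoding.
assert b"\x7c".decode("ascii") == "|"
assert b"\x7c".decode("cp437") == "|"

# In ISO 8859-1 (Latin-1) the broken bar has its own code, A6h.
assert "¦".encode("latin-1") == b"\xa6"
```

The loop prints U+007C VERTICAL LINE and U+00A6 BROKEN BAR.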

Loose Ends: It would be great to find the complete text of X3.4-1967, X3.4-1968 and (unlikely) even X3.4-1965. Likewise the first three editions of ECMA-6 would be very useful.

Update: After writing the article, I was able to locate an exceedingly well hidden complete copy of X3.4-1968 and ECMA-6 3rd edition from 1970. It is hard to be certain without seeing all revisions, but it is quite likely that ECMA-6 in fact never used the broken vertical bar. And very likely neither did ISO 646. However, now that I can finally read it, Appendix A5.3 of X3.4-1968 does offer further confirmation that the broken vertical bar was entirely intentional:

The character vertical line is shown as it is in the code table to avoid confusion with the solid vertical bar frequently used as a logical operator, which may be found in some systems as a graphic stylization of exclamation point.

ANSI X3.4-1968, Appendix A5.3

Furthermore, there is fascinating correspondence between ECMA and ANSI from 1970 squirreled away inside a large PDF. It is apparent that ECMA did not much like how the vertical bar was broken in the ANSI X3.4 standard. The ECMA Secretary General complained:

Already the permission given in the ASCII standard to “design the shape” of Exclamation Mark so that it practically looks like a vertical line (to be used as OR) led to the design of the so-called “broken bar” in 7/12. What should be broken (a stroke and a point) has become solid and what should be solid (a full stroke) has become broken.

D. Hekimi, ECMA Secretary General, March 10, 1970

The X3.2 chairman explained the complex reasons why the broken vertical bar was created, and also offered these insights:

The Logical OR – exclamation point compromise is not based on ration; it was that or no ASCII! (You see, I do need an exclamation point worse than a Logical OR in correspondence).

The vertical bar in ASCII is only represented as “¦” so as to not confuse it with Logical OR but is not called a “broken” vertical bar! (twice now)

Eric H. Clamons, X3.2 Chairman, March 21, 1970

Reading (barely) between the lines, it is apparent that the X3.2 chairman did not like the broken vertical bar either, but given the choice between ASCII with a broken bar or no ASCII, decided to go for the broken bar. And there we have it.


21 Responses to A wunderBAR Story

  1. John Elliott says:

    I remember finding it odd as a youngster using a BBC Micro in the 1980s, that the bar symbol on the keyboard was the broken bar ¦ but the default screen mode used a Teletext character generator in which character 7/12 was an unbroken double bar ‖. Now I come to look it up I find that’s because it was using the UK localised version of the Teletext font.

  2. Michal Necasek says:

    What was the double bar meant to be? The single vertical bar has a clear use in mathematics and programming, but what did the double bar do?

  3. Millie says:

    It was effectively the same character. MODE 7 (the Teletext mode) showed characters differently because they were stored as their ASCII codes, whereas in the other modes the characters were stored as bitmaps. Square and curly brackets were similar – in MODE 7 they showed up as 1/2, 3/4 and left+right arrows. In other modes they showed as what the keycaps said. Operationally this made no difference. Underscore and backslash were others that showed differently in MODEs 0-6 and 7, but functionally they were the same.

    The bar character was mainly used in sending control codes for the programmable function keys. |M being the most used (to my knowledge) as it was a carriage return.

  4. kodabar says:

    In World System Teletext, there were local variations. The UK version had the double line character – on the Wikipedia page, local variant characters are shown in red in the example:
    https://en.wikipedia.org/wiki/World_System_Teletext
    In the English (as in England) variant set, it’s 7C.
    https://en.wikipedia.org/wiki/Teletext_character_set

    The BBC Micro was made for the BBC and it was to include a teletext/viewdata screen mode. And the original World System Teletext standard was based off the BBC’s UK standard. So the BBC Micro used it because it was in the UK teletext standard.

    The double vertical bar indicates parallelism in mathematics, but can also indicate norm in pairs. As it’s included in a group of numerical characters in the UK character set, it seems likely to be for those mathematical uses. Though I would like to know for sure. I remember originally thinking it was part of a semi-graphics box-drawing set (like DOS code page 437). But there’s a different set of characters for those in teletext.

  5. Richard Wells says:

    FORTRAN did not use any special character set. The 47 characters available to FORTRAN included the 36 letters and numbers plus 11* other symbols that should be available to nearly any keyboard. Comparisons that in other languages involve a special symbol are dot commands in FORTRAN; i.e. .LT. for less than or .OR. for logical OR. (I know that some of the later Fortran revisions included special symbols but those were adopted years after other languages.)

    * 3.1.4 Special Characters. A special character is one
    of the eleven characters: blank, equals, plus, minus,
    asterisk, slash, left parenthesis, right parenthesis, comma,
    decimal point, and currency symbol. X3.9 page 8 1966

    Just for a bit of whimsy, APL used the bar and caret differently. The bar was magnitude, more often referred to as absolute value. The caret was the logical AND with an inverted caret for the logical OR.

  6. Retron says:

    For us in the UK, it’s always been fun. The (Unicomp) Model M keyboard I’m using to type this has “¦” on the key which produces “|” and vice-versa. A quick glance at the Logitech keyboard in the PC in the other room shows “¦” on both the “¦” and “|” keys…

    A mess, frankly, but one which you quickly got used to back in the DOS days. These days people look at you with an odd look if you mention piping commands, but once upon a time it was commonplace…

  7. Michal Necasek says:

    Yes, the utter lack of lowercase letters reflects FORTRAN’s age. I believe 48-character sets were common in printers used in the 1950s and 1960s. I suppose there was just enough to print a bank statement… and no more.

  8. John Elliott says:

    I don’t know what the reasons were for including ‖ in UK teletext. It might have been more understandable if it had been a downward-pointing arrow, since unless I’ve missed something there are left, right and up arrows but not a down one.

  9. SweetLow says:

    >The same ASCII codepoint is rendered as a solid vertical bar in Windows 10
    Just use raster fonts in console setup. Most of the [old] variants (and my favorite 10 x 18) are broken.

  10. scruss says:

    Richard Wells wrote:

    # FORTRAN did not use any special character set

    That’s true once we got to the days of ASCII, but less so before that. In punched card times, there were IBM “Commercial” (BCD-A) and “Fortran” (BCD-H) character sets:

    COMM &-0123456789ABCDEFGHIJKLMNOPQR/STUVWXYZ #@ .¤ $* ,%
    FORT +-0123456789ABCDEFGHIJKLMNOPQR/STUVWXYZ =’ .) $* ,(

    The Commercial set was missing characters like plus and parentheses that Fortran needed.

    The above from Doug Jones’s punched card codes — https://homepage.divms.uiowa.edu/~jones/cards/codes.html

  11. Richard Wells says:

    FORTRAN did not need the special characters. It might have looked a bit weird but I have seen enough references that explained code printed on the wrong chain would have percent and lozenge instead of parentheses. Some concessions had to be made due to the tiny memory spaces of the early computers. I expect that many modern programmers would have a panic attack if faced with the 1400 character memory of the bottom rung 1401.

    That was all gone shortly after 1964’s introduction of EBCDIC which means it was very old code by the time I heard about it.

  12. Dung Saga says:

    In the quoted correspondence between ECMA and ANSI from 1970 (https://ia800800.us.archive.org/35/items/enf-ascii-1968-1970/Image070917151315.pdf#page=44&zoom=auto,-166,734), they mentioned LVL and PLVM a lot.
    It was confusing at first. But later they mentioned that
    – LVL is Long Vertical Line
    – PLVM is Preprinted Long Vertical Mark

  13. Richard Wells says:

    The portion that jumped out to me was that for long vertical lines “high speed printers may have difficulty printing it.” Exactly what problems were expected I don’t know; if the printer can handle the lower case “l,” the very similar vertical bar should be doable.

    Having a bar alongside multiple lines of print could be cause for needing a broken bar. The line printers had gaps between each line. A break in the bar would result in a continuous broken bar for the entire length which might look better than a longer bar which only has breaks between printed lines. This was still early on and even standard creators accepted that the limits of hardware overrode the letter of the standard.

    IBM had a lot of character sets which can be seen in the IBM 26 documentation. The IBM 701 scientific mainframe used character set G which included two plus and two minus symbols. Sometime between 1952 (introduction of the 701) and 1963 (unchanged from the 1965 Bitsavers copy), character set H was introduced. I haven’t turned up evidence that character set H was created before FORTRAN.

  14. taimarost says:

    Ironic how | was modified to avoid confusion with “logical OR” (a |-shaped “!”) but is now widely used precisely as logical OR in many programming languages.

  15. Mike Gran says:

    Here’s some more trivia from the USA military side of things.

    MSCII (Military Standard Code for Information Exchange) Rev 07-13-1967, such as can be found in tables in MIL-STD-1280 (1969), was a version of ASCII (USAS X3.4-1967) tailored by the military for its use. Unlike ASCII, it had an unbroken vertical bar in position 7/12. So, it looks like the military never bought into the broken bar idea. Possibly because it wasn’t congruent with its Optical Character Recognition character set.

    The unbroken vertical bar appears in the USA Standard Character Set for Optical Character Recognition {USAS X3.17-1966}. In that standard, it is given the explicit purpose of being an “information separator” and is not intended to be logical OR. That standard lives on in the OCR-A font family that can still be found.

    In a note in MIL-STD-188C (1969), it says that “certain major procurements were made based on the pre-May 1966 version of USASCII”. It then claims that, in the pre-May 1966 version of ASCII, the vertical line was at position 7/14 and apparently was unbroken. So, MIL-STD-188C is evidence that the broken bar first appeared in a published version in late 1966 or in 1967.

    (The character at position 7/12 in pre-May 1966 USASCII was “overscore”, according to comments in MIL-STD-188C.)

    MIL-STD-188C does include an ASCII table based on X3.4-1968 and it has the broken bar.

  16. Michal Necasek says:

    It was slightly more complicated. It was assumed that the (sometimes-broken, sometimes-not) vertical bar in position 7/12 could be used as a logical OR. But it was also assumed that in some countries, there might be some other character at position 7/12. And then the (possibly re-formed) exclamation mark would be used. The hated trigraphs in C were a different solution to much the same problem.

  17. Michal Necasek says:

    It makes sense that the US military would not be concerned about international variants of the character coding, so they just completely avoided the whole thing. (Although I can’t find any evidence that any non-US variant, like ECMA or ISO, ever used the broken bar either).

    I wonder about the vertical line at 7/14. The X3.4-1963 standard had no vertical bar at all, and at 7/14 there was the ESC control character. As far as I know X3.4-1965 was never published, and the next published version was X3.4-1967. But of course there were draft versions, and maybe that’s what they’re referring to.

    We know that the vertical bar was broken in X3.4-1967, and it’s highly likely that the “breaking” occurred sometime in 1965 or 1966.

  18. Richard Wells says:

    ECMA-94 has a broken vertical bar though that was quite late.

    I did find an MIT paper from 1968 with a demonstration of an early CRT terminal concept which included a broken vertical bar. No reason given for that but the narrow matrix means that the lower case “L” is indistinguishable from an unbroken vertical line. The standard propagated fast.

  19. Michal Necasek says:

    Yes, there is a broken vertical bar in ECMA-94 from 1985/1986. However, it is in position 10/6 (“broken bar”), and in position 7/12 there is a solid vertical bar (“vertical line”).

  20. Stephen Cole says:

    I still prefer to think of pipe as broken vertical bar (like in the MS-DOS days)
    and reserve solid vertical bar for logical OR (and double vertical for conditional OR)
    [I bet a lot of us mistakenly coded a single vertical bar when we really meant double one … 🙂 ]

    However have to live with solid vertical bar (on kbd and screen) as a pipe character in the modern world

  21. Richard Wells says:

    The Paul Allen King paper from 1968 shows some of the problems with early CRT displays. A lot of characters are hard to distinguish. The interlaced example has the broken bar that looks like a pair of tear drops which, at least, was easy to tell apart from the lower case “L.” Even today, many fonts are not optimized for easy discernment of characters.
