August 20th, 2012 at 11:54 pm by Dr. Drang
I keep getting tripped up by a historical quirk in BBEdit. It’s something that regular BBEdit users probably don’t think twice about, but it catches me every time. I’m hoping that by writing about it here, I’ll train myself to think before typing.
You know about the line endings problem, right? Those of us who try to live our lives in text files like to talk about how portable and universal they are, but line endings are the dirty little secret we’re all aware of but don’t mention.
Two ASCII control characters are used for line endings:
- Carriage return (CR), which is ASCII character 13 in decimal, 0D in hex, and usually given as \r in regular expressions.
- Line feed (LF), which is ASCII character 10 in decimal and 0A in hex, and usually given as \n in regular expressions.
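As a quick check of those numbers, here is a minimal Python sketch. The `describe` helper is a made-up name for illustration; it just reports the decimal and hex codes behind the two escape sequences.

```python
# The two line-ending control characters, written as escape sequences.
CR = "\r"
LF = "\n"

def describe(ch):
    """Return the (decimal, hex) codes for a single character."""
    return ord(ch), format(ord(ch), "02X")

print(describe(CR))  # (13, '0D')
print(describe(LF))  # (10, '0A')
```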
These control characters are a holdover from the Teletype days and represent two separate actions of the printer: returning the print head to the beginning of the line, and advancing the paper one line.
Text files on DOS and Windows computers, somewhat anachronistically, end every line with a CR+LF combination, as if they were still controlling a mechanical printer. Unix followed the lead of its predecessor Multics and used just the LF character. Apple, in both the Apple II line and the early Macintosh, also used just a single character to end lines, but it chose CR.
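To make the three conventions concrete, here is a rough Python sketch that guesses which one a chunk of file bytes follows. The function name and the labels it returns are assumptions made for this example, and a file with mixed endings simply gets whichever style is most common.

```python
def line_ending_style(data: bytes) -> str:
    """Guess which line-ending convention a chunk of bytes uses."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # lone CRs only
    lf = data.count(b"\n") - crlf  # lone LFs only
    counts = {
        "CRLF (DOS/Windows)": crlf,
        "LF (Unix)": lf,
        "CR (classic Mac)": cr,
    }
    style, n = max(counts.items(), key=lambda kv: kv[1])
    return style if n > 0 else "no line endings"

print(line_ending_style(b"one\r\ntwo\r\n"))  # CRLF (DOS/Windows)
print(line_ending_style(b"one\ntwo\n"))      # LF (Unix)
print(line_ending_style(b"one\rtwo\r"))      # CR (classic Mac)
```

The read has to be in binary mode for this to work on a real file, because Python's text mode translates line endings before you ever see them.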
And therein lies BBEdit’s quirk. Under Mac OS, a regular expression search in BBEdit that looked for line endings would search for \r. This made sense because that was the line ending for Mac-native text files. When the transition to OS X came, Bare Bones could have insisted that users search for \n to find line endings in Unixy text files. It didn’t. Instead, regular expressions must still use \r to find line endings, regardless of how the file is saved.
I assume Bare Bones decided to use a single regular expression for all line endings because it allowed its users to maintain their ingrained habits during the transition from Mac OS to OS X, and because it meant BBEdit AppleScripts with regex searches wouldn’t have to be rewritten. And if you use BBEdit to edit files from both Windows and OS X, you don’t have to think about the line endings when constructing a search—the same search will work on both file types.
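The idea of one search that works whatever the file's line endings can be imitated outside BBEdit. The Python sketch below is only an analogy and says nothing about how BBEdit implements it internally; `ANY_EOL` and `split_lines` are names invented for the example.

```python
import re

# Try CRLF first so it counts as one line ending, not two.
ANY_EOL = re.compile(r"\r\n|\r|\n")

def split_lines(text):
    """Split text on any of the three line-ending conventions."""
    return ANY_EOL.split(text)

samples = [
    "one\r\ntwo\r\nthree",  # DOS/Windows
    "one\ntwo\nthree",      # Unix
    "one\rtwo\rthree",      # classic Mac
]
for s in samples:
    print(split_lines(s))  # ['one', 'two', 'three'] each time
```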
Still, I have a 15-year habit of using \n for line endings. That habit’s been reinforced not only by the text editors I’ve used, but also by the scripting languages I’ve programmed in. Too often, I’ve started filling the search field with a pattern built around \n and been perplexed when it turned up no hits.
It would be easier to make the transition if I could always use \r, but that’s not going to happen. Perl and Python both use \n, and they’re not going to change. I’ll just have to become more flexible and context-sensitive.
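For what it's worth, part of the reason the \n habit holds up so well in scripts is that Python translates line endings on the way in. This small sketch assumes Python 3 and its default universal newlines mode (`newline=None`); the `read_text` helper is hypothetical and wraps in-memory bytes so no real file is needed.

```python
import io

def read_text(raw: bytes) -> str:
    """Decode bytes the way open() does in text mode with newline=None."""
    wrapper = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
    return wrapper.read()

for raw in (b"one\r\ntwo\r\n", b"one\ntwo\n", b"one\rtwo\r"):
    print(repr(read_text(raw)))  # 'one\ntwo\n' each time
```

Whatever the file used on disk, the script sees only \n, which is the opposite of BBEdit's choice but the same trick.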