Many happy returns

I keep getting tripped up by a historical quirk in BBEdit.[1] It’s something that regular BBEdit users probably don’t think twice about, but it catches me every time. I’m hoping that by writing about it here, I’ll train myself to think before typing.

You know about the line endings problem, right? Those of us who try to live our lives in text files like to talk about how portable and universal they are, but line endings are the dirty little secret we’re all aware of but don’t mention.

Two ASCII control characters are used for line endings:

• Carriage return (CR), which is ASCII character 13 in decimal, 0D in hex, and usually given as \r in regular expressions.
• Line feed (LF), which is ASCII character 10 in decimal and 0A in hex, and usually given as \n in regular expressions.

These control characters are a holdover from the Teletype days and represent two separate actions of the printer: returning the print head to the beginning of the line, and advancing the paper one line.
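To make those numbers concrete, here is a quick check in Python. It is only an illustration; the names CR and LF are just labels chosen for this snippet.

```python
import re

# The two line-ending control characters, written with the usual escapes.
CR = "\r"  # carriage return
LF = "\n"  # line feed

# ord() gives the character code; hex() shows it in hexadecimal.
print(ord(CR), hex(ord(CR)))  # 13 0xd
print(ord(LF), hex(ord(LF)))  # 10 0xa

# In a regular expression, \r and \n match exactly these characters.
cr_matches = re.fullmatch(r"\r", CR) is not None
lf_matches = re.fullmatch(r"\n", LF) is not None
```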

Text files on DOS and Windows computers, somewhat anachronistically, end every line with a CR+LF combination, as if they were still controlling a mechanical printer. Unix followed the lead of its predecessor Multics and used just the LF character. Apple, in both the Apple II line and the early Macintosh, also used just a single character to end lines, but it chose CR.
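The three conventions are easy to tell apart by counting bytes. The sketch below is one way to do it in Python; the function name detect_line_ending and its return labels are made up for this example, and it simply reports whichever convention appears most often.

```python
def detect_line_ending(data: bytes) -> str:
    """Guess the line-ending convention used in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Subtract the CR+LF pairs so only lone CRs and lone LFs remain.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf

    counts = {"CRLF": crlf, "LF": lf, "CR": cr}
    best = max(counts, key=counts.get)
    # A file with no line endings at all has no convention to report.
    return best if counts[best] > 0 else "none"
```

Reading the file in binary mode (open(path, "rb")) matters here, because Python's text mode translates line endings by default and would hide the difference.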

And therein lies BBEdit’s quirk. Under Mac OS, a regular expression search in BBEdit that looked for line endings would search for \r. This made sense because that was the line ending for Mac-native text files. When the transition to OS X came, Bare Bones could have insisted that users search for \n to find line endings in Unixy text files. It didn’t. Instead, regular expressions must still use \r to find line endings, regardless of how the file is saved.

I assume Bare Bones decided to use a single regular expression for all line endings because it allowed its users to maintain their ingrained habits during the transition from Mac OS to OS X, and because it meant BBEdit AppleScripts with regex searches wouldn’t have to be rewritten. And if you use BBEdit to edit files from both Windows and OS X, you don’t have to think about the line endings when constructing a search—the same search will work on both file types.
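Here is a rough model of that behavior in Python. It assumes the editor converts every line ending to a lone CR when it loads a file; that is a guess for the sake of illustration, not a description of BBEdit's actual internals, and load_into_buffer is a hypothetical name.

```python
import re

def load_into_buffer(raw: str) -> str:
    """Hypothetical loader: turn CRLF and LF line endings into a lone CR."""
    # Handle CRLF first so its LF is not converted a second time.
    return raw.replace("\r\n", "\r").replace("\n", "\r")

unix_text = "alpha\nbeta\ngamma\n"
windows_text = "alpha\r\nbeta\r\ngamma\r\n"

# One pattern for both files: an "a" immediately before a line ending.
pattern = re.compile(r"a\r")

unix_hits = pattern.findall(load_into_buffer(unix_text))
windows_hits = pattern.findall(load_into_buffer(windows_text))

# The same search written with \n finds nothing in the CR-only buffer.
lf_hits = re.findall(r"a\n", load_into_buffer(unix_text))
```

Under this assumption, the \r pattern finds all three line endings in both files, while the \n version comes up empty.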

Still, I have a 15-year habit of using \n for line endings. That habit’s been reinforced not only by the text editors I’ve used, but also by the scripting languages I’ve programmed in. Too often, I’ve started filling the search field with a pattern that uses \n for the line ending and been perplexed when it turned up no hits.

It would be easier to make the transition if I could always use \r, but that’s not going to happen. Perl and Python both use \n, and they’re not going to change. I’ll just have to become more flexible and context-sensitive.

1. Despite a brief interest in Sublime Text, I’m now pretty sure BBEdit is going to be my replacement for TextMate. I’m not ready to make a Marco Arment-style announcement, but I’m starting to feel comfortable with it.

7 Responses to “Many happy returns”

1. I have tried to use BBEdit in the past, but I am a deeply superficial person, and I can never get past the fact that it looks like a twenty-year-old program, which it of course is. Sublime Text I expected to hate because it’s not a native app, but actually they’ve done a pretty good job of faking it. Services even work now!

2. Clark says:

Carl, it looks much better now. I admit that was my #1 complaint about BBEdit vs. TextMate. Which seems a bit silly in hindsight. If you hide the toolbar though it really does look quite contemporary.

3. Clark says:

BTW - I posted a question about this on the mailing list. I just don’t grep explicit line endings enough I guess. When I need them I’m more apt to use the special characters in grep.

4. Holy crap. I’ve been using BBEdit for around ten years, yet I don’t think I’ve ever consciously thought about this weirdness. In fact, I just did a test search for “\n” in an open document to prove you wrong, and was shocked to find no matches.

Given BBEdit’s notoriety for hidden preferences (configurable by the ‘defaults’ command-line tool, etc.), I’m surprised there isn’t one to enable standard behaviour of \n. (Or is there…?)

5. Clark says:

As I said I asked today. They said it’s planned for the future but that it’s low level through much of the legacy code.

Personally I’d think the easiest way to do it is just run a filter across the grep text before passing it to the grep engine. That should be trivial since you’re just replacing ‘\n’ with ‘\r’ internally.
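For illustration only, such a filter might look like the Python sketch below. The function name translate_pattern is hypothetical, and nothing here reflects how BBEdit actually works. One wrinkle: a plain string replace would also change the "\n" inside an escaped backslash followed by n (\\n), so the sketch walks the pattern one escape pair at a time.

```python
import re

def translate_pattern(pattern: str) -> str:
    r"""Rewrite each \n escape in a regex pattern as \r (hypothetical filter)."""
    def swap(match):
        # Only the two-character escape \n is changed; every other
        # escape pair, including \\, passes through untouched.
        return r"\r" if match.group(0) == r"\n" else match.group(0)

    # "\\." consumes a backslash plus the character after it, so an
    # escaped backslash is used up before the following "n" is seen.
    return re.sub(r"\\.", swap, pattern, flags=re.DOTALL)
```

With this, foo\nbar becomes foo\rbar and [^\n]+ becomes [^\r]+, while a\\nb (a literal backslash followed by n) is left as it was.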

6. Regarding CR and LF, Patrick Woolsey of Bare Bones said essentially the same thing to me in a support email as he said to Clark in the forum. The use of CR in BBEdit is indeed deeper than my post suggests, which has implications when you need AppleScripts and “shell” scripts (which could be written in Perl, Python, Ruby, etc.) to interact. I’ll have more to say about this in a later post.

And I agree that BBEdit looks as modern as any editor when you hide that chunky toolbar. The only aesthetic concern I have is the lack of italic and (especially) bold styles in the syntax highlighting rules. Don’t understand why they aren’t allowed.

7. Clark says:

I think they’re an artifact of the display engine they wrote. So long as you don’t have bold or italic letters take up more space than a regular letter I don’t see the problem.

Of course with the Retina MBP they are having to finish rewriting the display engine earlier than they had wished. So maybe it’ll be easy to add this in.