Unix text tools to the rescue

Yesterday at work, I wrote a report summarizing a bunch of tests I’ve been running for the past month or so. There were several tables of results, including one that listed, among other things, all of the parts and the temperatures at which they were tested. The table had a few dozen rows and five or six columns, but the pertinent portion of it looked something like this, with the rows sorted by part number:

| Part no. | Temperature |
|:--------:|:-----------:|
|   1105   |     150     |
|   1308   |     150     |
|   1847   |      RT     |
|   2505   |      RT     |
|   2876   |     150     |
|   3289   |      RT     |
|   3527   |     150     |
|   4010   |     200     |
|   4087   |      RT     |
|   4689   |      RT     |
|   4871   |     200     |
|   5067   |     150     |
|   5886   |      RT     |
|   6544   |     150     |
|   6957   |     150     |
|   7174   |      RT     |
|   8035   |     200     |
|   8294   |     150     |
|   8322   |      RT     |
|   8434   |     150     |

As I’ve mentioned before, I write my reports in (Multi)Markdown, which I then convert to a PDF that I can print or, more commonly nowadays, email to the client.

I didn’t send the report to the client yesterday, because I wanted to let it “rest” overnight and reread it with fresh eyes this morning.1 Sure enough, when I looked at it this morning, I decided it needed a new paragraph in which I listed all the parts that were tested at room temperature, all those tested at 150°, and all those tested at 200°.

There are lots of ways to pull those part numbers out of the table. For example:

  1. I could go through the table line by line and copy all the “RT” part numbers and paste them into a list, and then repeat that for “150” and “200.” This is both tedious and subject to error.
  2. I could transform the table into tab-separated values, paste that into a spreadsheet, sort by temperature, and then copy out the rows of part numbers corresponding to each temperature. The hardest part of this is the transformation into tab-separated values, which I’d probably do with a regular expression.
  3. I could do it pretty much all at once by copying the table to the clipboard and piping it through an awk one-liner.

I chose door number 3.

Here’s the one-liner:

pbpaste | awk 'BEGIN {ORS=", "} / RT / {print $2}'

which gives an output of

1847, 2505, 3289, 4087, 4689, 5886, 7174, 8322, 

Now it’s true that I needed to delete the trailing comma and stick an “and” before the last item, but I’d have to do that with the other two methods, too. Once I had the room temperature part numbers, I repeated the command with “150” in place of “RT” and then with “200” in place of “RT.”
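The three separate runs and the hand cleanup could, in principle, be folded into a single pass. What follows is only a sketch of that idea, not the command that was actually run: the function name `list_by_temp` and the array names are made up for illustration, and it assumes every data row looks like the table above, with the part number in the second whitespace-separated field and the temperature in the fourth. A here-document with a few rows stands in for the clipboard so the example runs anywhere.

```shell
# list_by_temp: group part numbers by temperature in one pass and
# join each group with commas and a final "and".
# (Hypothetical helper, written for this sketch.)
list_by_temp() {
  awk '
    # Data rows have a number in the second field; this skips the
    # header row and the |:---:| alignment row.
    $2 ~ /^[0-9]+$/ {
      temp = $4
      if (!(temp in count)) order[++ntemps] = temp  # keep first-seen order
      parts[temp, ++count[temp]] = $2
    }
    END {
      for (i = 1; i <= ntemps; i++) {
        temp = order[i]
        n = count[temp]
        sep = (n > 2) ? ", " : " "   # two items need no comma
        line = temp ": "
        for (j = 1; j <= n; j++) {
          if (j > 1) line = line sep
          if (j == n && n > 1) line = line "and "
          line = line parts[temp, j]
        }
        print line
      }
    }
  '
}

# On OS X this would be: pbpaste | list_by_temp
list_by_temp <<'EOF'
| Part no. | Temperature |
|:--------:|:-----------:|
|   1105   |     150     |
|   1847   |      RT     |
|   2505   |      RT     |
|   4010   |     200     |
|   4087   |      RT     |
EOF
```

For those five rows it prints one line per temperature, in the order each temperature first appears: `150: 1105`, then `RT: 1847, 2505, and 4087`, then `200: 4010`. Whether this is worth the extra typing over running the one-liner three times is debatable.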

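The pbpaste command is an OS X-specific utility that sends the contents of the clipboard to standard output.2 The awk script in the one-liner, between the single quotes, has two parts:

  1. A BEGIN block, which runs before any input is read and sets the output record separator (ORS) to a comma followed by a space.
  2. A pattern and action, which prints the second field (the part number) of every line that matches “ RT ”, spaces included.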
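Spelled out with comments, the script from the one-liner looks something like this. The wrapper function name `extract_rt` is invented for the sketch, and a here-document with three rows from the table stands in for the clipboard so that it runs on systems without pbpaste.

```shell
# extract_rt: the one-liner, spread out and annotated.
# (The function name is made up for this sketch.)
extract_rt() {
  awk '
    # Part 1: the BEGIN block runs once, before any input is read.
    # ORS is the output record separator, written after every print;
    # changing it from a newline to ", " puts the results on one line.
    BEGIN { ORS = ", " }

    # Part 2: a pattern and an action. For every line that matches
    # " RT " (spaces included), print the second field. Fields are
    # split on whitespace, so $1 is the leading "|" and $2 is the
    # part number.
    / RT / { print $2 }
  '
}

# Three rows from the table stand in for the clipboard;
# on OS X this would be: pbpaste | extract_rt
extract_rt <<'EOF'
|   1308   |     150     |
|   1847   |      RT     |
|   2505   |      RT     |
EOF
```

Because ORS is written after every record, including the last, the output ends with a comma and a space, which is where the trailing comma comes from.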

Honestly, even if the spreadsheet method were faster, I’d still prefer to do this on the command line. It’s just more fun.


  1. It should be clear from the typos and missing/extra words that I don’t do this with blog posts. 

  2. Thanks to Vitor Galvão for catching this. I got ahead of myself—it’s the pipe that turns it into standard input. 


Hard SF

While on spring break last week, I listened to the Incomparable episode in which the Book Club discussed Andy Weir’s The Martian. I haven’t read it myself (I’m currently 25th in line for it at my local library), but I did want to talk about the panel’s reaction to it, which surprised me.

This particular edition of the Book Club comprised Lex Friedman, John Moltz, Lisa Schmeiser,1 Scott McNulty, and Jason Snell. They liked the book but tempered their praise by gently disapproving of the technical detail Weir includes at the expense of character development. This struck me as odd coming from fans of science fiction, a genre in which technical detail has traditionally been more important than character development or emotional depth. Fans of hard sf are typically willing to accept characters of limited dimension as long as the technical bits are interesting and the plot moves along.

I’m currently rereading The Foundation Trilogy for the first time since high school, and I’m struck by how Asimov is able to get away with long stretches of exposition, sometimes coming in dialog, but often simply in the voice of the omniscient third-person narrator. The whole thing is one long violation of the “show, don’t tell” rule, and yet I keep turning the pages. I like to think of myself as a little more discerning than my 16-year-old self, but the cardboard characters just don’t bother me because I enjoy watching Asimov play out the trick that drives each story. (I think his robot stories have better tricks, but I haven’t read them since high school, either. Maybe I’d change my mind if I read them again.)

What the Book Club’s discussion most reminded me of was a passage from Heinlein’s Expanded Universe, a collection of older short fiction and nonfiction tied together with new introductions and afterwords that came out in 1980.

Expanded Universe

When I got back home, I pulled my old Ace paperback up from the basement and went searching for the passage I remembered. It concerns a set of orbital calculations Heinlein and his wife did while he was writing Space Cadet.

I was telling [a visitor] about a time I needed a synergistic orbit from Earth to a 24-hour station; I told him what story it was in, he was familiar with the scene, mentioned having read the book in grammar school.

This orbit is similar in appearance to cometary interplanet transfer but is in fact a series of compromises in order to arrive in step with the space station; elapsed time is an unsmooth integral not to be found in Hudson’s Manual but it can be solved by the methods used on Siacci empiricals for atmosphere ballistics: numerical integration.

I’m married to a woman who knows more math, history, and languages than I do. This should teach me humility (and sometimes does, for a few minutes). Her brain is a great help to me professionally. I was telling this young scientist how we obtained yards of butcher paper, then each of us worked three days, independently, solved the problem and checked each other—then the answer disappeared into one line of one paragraph (SPACE CADET) but the effort had been worthwhile as it controlled what I could do dramatically in that sequence.

That’s the sort of dedication to technical detail that I expect in hard science fiction stories, although the key is the “one line of one paragraph” part. I suppose if Weir does the equivalent of including all of Bob and Ginny’s butcher paper in The Martian, he’s going beyond the limits of indulgence for even the most tolerant sf fan. I’ll find out when the 24 people in front of me are done reading it.

By the way, if you don’t understand Heinlein’s reference to “Hudson’s Manual,” he’s referring to The Engineers’ Manual by Ralph Hudson, a 300-page digest of facts, figures, and formulas that covers a broad range of engineering disciplines. It’s out of print now, but I still have and use the copy my dad gave me when I went off to college.

Hudson's Manual

Not much plot, no character development at all, but I do love this book.


  1. Whose Penny Wiseacre blog has become a favorite. I started with her well-linked Black Hand post and have stuck around. 


Afghanistan, March 2014

You’ve no doubt already heard that there were no US military deaths in Afghanistan in March, the first time that’s happened since January of 2007. Unfortunately, the figures for February were revised upward, so the totals are still higher than they were when I posted them last month.


I visited the LBJ Presidential Library a few days ago. It was odd seeing exhibits of a Democratic president whose great domestic achievements were overshadowed by his mistaken escalation of a war. That won’t happen to Obama, of course, because the nation has long since stopped caring about Afghanistan.


Letterman

I suppose David Letterman is to most of you what Johnny Carson was to me: the talk show host who’s always been there. But for those of us of a certain age, Letterman will always be the new thing, the breath of fresh air, the guy who changed television.

You can find antecedents for everything he did, just as you can with the Beatles. But as with the Beatles, it’s what Letterman did with his influences that made his work magical. If you weren’t there at the time, it’s hard to explain what made Late Night so different. Bits like Stupid Pet/Human Tricks, Viewer Mail, The Fugitive Guy, the Velcro/Alka-Seltzer/Sponge/Magnet/Potato Chip Suit, and Larry “Bud” Melman probably don’t seem revolutionary to you because your whole life has been spent watching Dave and his many imitators rework and refine the ironic viewpoint that led to them.

I was in college when Late Night debuted, so it shouldn’t surprise you how taken I was with it. That’s the time of life when irony is king. In the early days, Letterman was hugely popular on campuses and with recent graduates. Unfortunately, that popularity didn’t stop the NBC affiliate that served Champaign-Urbana from replacing Late Night with the execrable Thicke of the Night for one season. To this day, I hate Alan Thicke because of that lost year.1

Much of what was good about those early years of Late Night can be traced to the wonderful Merrill Markoe. There were great fully produced pieces, of course, but she understood better than anyone how to put Dave in an unexpected or uncomfortable situation and let him struggle for our entertainment.

It’s hard to get a sense of it from a single clip like this, but Dave’s interviews in those early years were famously bad. Much is made these days of Carson’s smooth ability to put his guests at ease. Dave was precisely the opposite; he seemed to enjoy nothing more than seeing things go off the rails, and you can see that in any number of clips on YouTube.

It’s funny how certain things stay with you. In 1984, Bob Dylan appeared on the show (playing, if I remember correctly, a Stratocaster borrowed from Keith Richards). It was a Thursday, so there was a Viewer Mail segment after the monologue. One of the letters was from a guy named Eric Anderson. When Paul Shaffer heard the name, he butted in. “Eric Anderson? Wow, it’s a big night for The Bitter End.” The joke died because no one in the audience understood the reference.2 But Dave did, and he delighted in how Paul’s joke fell flat. “Too hip for the room, Paul.”

I tell that story not merely to let you know that I was hip enough to get it, but also to lead into one last topic: how great the music on the show has always been. Obviously, most of the credit for that goes to Paul and the rest of the band, but you could tell Dave always loved having outside musicians on the show. Apart from Cher and Madonna, both of whom were there as “stars” rather than singers, I can’t remember him having an uncomfortable, Oliver Reed-like interview with a musician or a singer.

Some of my favorite episodes were when someone like Warren Zevon, Todd Rundgren, or Peter Frampton would just sit in with the band. You can’t, however, get the flavor of those shows from a short clip, so I’ll leave you with an unusually animated Dylan from the aforementioned show.

It looks like Dylan also borrowed some bandannas from Keith Richards and attached them to his lead guitarist.


  1. I’d probably hate Robin Thicke for it, too, just by association, if there weren’t plenty of independent reasons to hate Robin Thicke. 

  2. You probably don’t either, but don’t feel bad. The reference was 20 years old then, and it’s 50 years old now.