Archive for the ‘programming’ Category
Words with many doubled letters
July 1st, 2008
In a footnote to my last post, I mentioned that typing “bookkeeper” over and over made me realize that it had three consecutive doubled letters, and I wondered if it was the only such word in English. Today I learned that the answer is basically “yes.”
Unix and Unix-like systems (Linux, OS X) have a builtin spellchecker that accesses a simple text-only dictionary at /usr/share/dict/words. This is a file with, on my machine, 234,936 words, one per line. Finding out how many have three consecutive doubled letters is a simple Perl one-liner:
perl -ne 'print if /(.)\1(.)\2(.)\3/' /usr/share/dict/words
The n option tells Perl to read the file one line at a time and apply the program to each line in turn. The e option tells it that the program is on the command line, not in a file. The program itself—between the single quotes—prints the line if it matches the regular expression between the slashes. The regular expression looks for
(.) any character (captured)
\1 the first captured expression
(.) any character (captured)
\2 the second captured expression
(.) any character (captured)
\3 the third captured expression
which is regular expression-speak for three consecutive doubled characters. The result is
bookkeeper
bookkeeping
subbookkeeper
So “bookkeeper” and other forms of it are the only words in the dictionary with three consecutive doubled letters. I find “subbookkeeper” rather suspect; it seems like a word coined specifically to have four consecutive doubled letters. It can be found on the Internet, but I’m not buying it.
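For comparison, here is a rough Python sketch of the same test using the re module. It only illustrates the regular expression above; the function name and the short word list are made up for the example, and the real search reads /usr/share/dict/words.

```python
import re

# Same pattern as the Perl one-liner: three captured characters,
# each immediately followed by a copy of itself.
TRIPLE_DOUBLE = re.compile(r'(.)\1(.)\2(.)\3')

def has_three_consecutive_doubles(word):
    """Return True if word contains three doubled letters in a row."""
    return TRIPLE_DOUBLE.search(word) is not None

# A small hand-picked sample, not the system dictionary.
sample = ['bookkeeper', 'committee', 'subbookkeeper', 'balloon']
hits = [w for w in sample if has_three_consecutive_doubles(w)]
print(hits)
```

“committee” and “balloon” drop out because their doubled letters are not all adjacent to one another.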
The one-liner can be generalized to find all the words with three doubled letters, regardless of whether they’re consecutive.
perl -ne 'print if /(.)\1.*(.)\2.*(.)\3/' /usr/share/dict/words
This gives a list of 170 words. Some, like “whippoorwill,” “successfully,” and “committee,” are pretty common. Most, like “unpossessedness” and “buttressless,” are weird constructions with “-ness” or “-less” suffixes that only a lawyer or bureaucrat would use.
There are four words with four doubled letters. The one-liner
perl -ne 'print if /(.)\1.*(.)\2.*(.)\3.*(.)\4/' /usr/share/dict/words
yields
killeekillee
possessionlessness
subbookkeeper
successlessness
I love “successlessness” because it defines whoever would use it.
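The three one-liners differ only in how many doubled letters they ask for and in what may sit between them, so the pattern can be built from a count. This is a sketch in Python rather than Perl; doubled_pattern is a hypothetical helper and the sample words are taken from the lists above.

```python
import re

def doubled_pattern(n, consecutive=False):
    """Build a regex that matches n doubled characters.

    With consecutive=False, any text may sit between the pairs.
    """
    pairs = [r'(.)\%d' % i for i in range(1, n + 1)]
    glue = '' if consecutive else '.*'
    return re.compile(glue.join(pairs))

four = doubled_pattern(4)
sample = ['successlessness', 'committee', 'killeekillee', 'bookkeeper']
matches = [w for w in sample if four.search(w)]
print(matches)
```

Calling doubled_pattern(3, consecutive=True) reproduces the pattern from the first one-liner.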
My no-server personal wiki—Part 3
June 26th, 2008
This is the last post describing the self-contained wiki-like system I use to keep track of project notes at work. The first post in the series explained my motivation for creating this system. The second post described how I use it. In this post, I’ll show the behind-the-scenes programming that puts it together.
Makefile
Let’s start with the Makefile. I mentioned before that the HTML pages are generated by running the make utility in the notes directory. Here’s the Makefile.
1: # Makefile for project notes.
2:
3: mdfiles := $(wildcard *.md)
4: htmlfiles := $(patsubst %.md, %.html, $(mdfiles))
5:
6: all: notesList.js $(htmlfiles)
7:
8: notesList.js::
9: 	python buildNotesList.py > notesList.js
10:
11: %.html: %.md project.info header.tmpl footer.tmpl
12: 	python buildPage.py $* > $@
13:
14: clean:
15: 	rm $(htmlfiles) notesList.js
Lines 3 and 4 use wildcards and pattern substitutions to create variables that define all the Markdown-formatted (.md) content files and the corresponding HTML pages.
Line 6 defines the all rule; because it’s the first rule in the Makefile, it’s also the default rule, so executing make is the same as running make all. It builds the JavaScript file notesList.js and the HTML pages.
Lines 8 and 9 define the rules for building the notesList.js file. It’s built by running the Python program buildNotesList.py, which we’ll get to in a bit. As we’ll see, notesList.js defines the sidebar links to all the HTML pages in the notes folder. Since new notes can be added at any time, notesList.js must be rebuilt whenever make is run.
Lines 11 and 12 define the rule for building the HTML notes pages. A page gets (re)built whenever
- its corresponding .md file is created;
- its corresponding .md file is updated; or
- the project.info file or the header or footer template file is updated.
The page is built by running another Python program, buildPage.py, taking the corresponding .md file as input.
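The decision make is making here comes down to comparing modification times. A minimal sketch of that test in Python, assuming nothing beyond the standard library (needs_rebuild is a name invented for the illustration; make does this itself):

```python
import os

def needs_rebuild(target, prerequisites):
    """Mimic make's timestamp test for one target.

    Rebuild if the target is missing or if any prerequisite
    was modified more recently than the target.
    """
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_time for p in prerequisites)

# Hypothetical usage, run from inside a notes folder:
# needs_rebuild('aa-overview.html',
#               ['aa-overview.md', 'project.info', 'header.tmpl', 'footer.tmpl'])
```

The first branch covers a newly created .md file, which has no HTML page yet; the second covers edits to the .md file or to any of the shared files.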
Lines 14 and 15 define a cleanup rule that deletes the HTML files and notesList.js. This rule is executed by running make clean from the Terminal. It’s sort of a defensive rule; if things get really screwed up, make clean will take me back to a pristine state. All the deleted files can be regenerated by running make.
notesList.js and buildNotesList.py
As mentioned above, notesList.js is a JavaScript file that’s used to generate the list of links to other notes pages that appears at the top of the sidebar (see the screenshot of a page in Part 2). It defines a JavaScript function, showNotesList, that writes a series of list items with links to the notes pages. In the skeleton version of the notes folder described in Part 2, there is only one HTML notes page so notesList.js is very simple:
function showNotesList(){
document.write('<li><a href="aa-overview.html">Overview</a></li>')
}
The notesList.js file is generated by the Python program buildNotesList.py:
1: #!/usr/bin/python
2:
3: import os
4:
5: # Get the titles of all the notes files in the directory. The
6: # title is assumed to be the first line of the file. Truncate
7: # the title at a word boundary if it's longer than maxlength.
8: # Print out a JavaScript function that will write an HTML list
9: # of the notes files.
10:
11: fileLI = []
12: maxlength = 35
13: allFiles = os.listdir('.')
14: baseNames = [ f[:-3] for f in allFiles if f[-3:] == '.md' ]
15: for fn in baseNames:
16:     f = file(fn + '.md')
17:     top = f.readline()
18:     title = top.strip('# \n')
19:     if len(title) > maxlength:
20:         words = title.split()
21:         twords = []
22:         count = 0
23:         for w in words:
24:             if count + len(w) > maxlength:
25:                 break
26:             else:
27:                 twords.append(w)
28:                 count += len(w) + 1
29:         title = ' '.join(twords) + "&hellip;"
30:     fileLI.append('<li><a href="%s.html">%s</a></li>' % (fn,title))
31:     f.close()
32:
33: print '''function showNotesList(){
34: document.write('%s')
35: }''' % ' '.join(fileLI)
I think the comment at the top of the file describes it pretty well. The trickiest part of the program is getting the title of the link. The title of the page is the first line of the .md file, with any leading or trailing hash marks (#) deleted. But since a page title could be pretty long and the sidebar is rather narrow, I wanted the link titles to be truncated to the nearest word boundary short of 35 characters. That’s what Lines 19-29 do, putting an ellipsis (…, HTML entity &hellip;) at the end to indicate the truncation.
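Pulled out on its own, the truncation logic looks something like the following. This is a sketch for a current Python, not the script itself; link_title is a hypothetical name and the second sample title is invented.

```python
def link_title(first_line, maxlength=35):
    """Turn the first line of a Markdown file into a sidebar link title."""
    title = first_line.strip('# \n')
    if len(title) <= maxlength:
        return title
    kept = []
    count = 0
    for word in title.split():
        if count + len(word) > maxlength:
            break
        kept.append(word)
        count += len(word) + 1   # the word plus the space after it
    return ' '.join(kept) + '&hellip;'

print(link_title('# Overview #\n'))
print(link_title('# Inspection notes from the third visit to the site #\n'))
```

The first call returns “Overview” untouched; the second stops after “third” because adding “visit” would push the count past 35.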
buildPage.py
This is the real workhorse of the system.
1: #!/usr/bin/env python
2:
3: import sys
4: import os
5: import os.path
6: import time
7: import string
8: import urllib
9:
10: # The argument is the basename of the Markdown source file.
11: mdFile = sys.argv[1] + '.md'
12:
13: # Open the page files and process the content.
14: header = open('header.tmpl', 'r')
15: footer = open('footer.tmpl', 'r')
16: cmd = 'MultiMarkdown %s | SmartyPants' % mdFile
17: content = os.popen(cmd, 'r')
18:
19: # Make the template.
20: templateParts = [header.read(), content.read(), footer.read()]
21: template = string.Template(''.join(templateParts))
22:
23: # Close the page files.
24: header.close()
25: footer.close()
26: content.close()
27:
28: # Initialize the dictionary of dynamic information.
29: info = {}
30:
31: # Dictionary entry with long modification date of the Markdown file.
32: mdModTime = time.localtime(os.path.getmtime(mdFile))
33: info['modldate'] = time.strftime('%B %e, %Y', mdModTime)
34: info['modldate'] = info['modldate'].replace('  ', ' ')
35:
36: # Dictionary entry with short modification date of the Markdown file.
37: info['modsdate'] = time.strftime('%m/%e/%y', mdModTime)
38: info['modsdate'] = info['modsdate'].replace(' ', '')
39:
40: # Dictionary entry with modification time of the Markdown file.
41: info['modtime'] = time.strftime('%l:%M %p', mdModTime)
42: if info['modtime'][0] == ' ':
43:     info['modtime'] = info['modtime'][1:]
44:
45: # Dictionary entry with absolute path to the Markdown file (for editing).
46: info['mdpath'] = os.path.abspath(mdFile)
47:
48: # Add project info to the dictionary.
49: projInfo = open('project.info', 'r')
50: for line in projInfo:
51:     if line[0] == '#' or line.strip() == '':
52:         continue
53:     name, value = [s.strip() for s in line.split('=', 1)]
54:     if name in info:
55:         info[name] += '\n' + value
56:     else:
57:         info[name] = value
58:
59: projInfo.close()
60:
61: # Dictionary entry with absolute path to project info file (for editing).
62: info['infopath'] = os.path.abspath('project.info')
63:
64: # Convert the contacts into a series of HTML list items.
65: if 'contact' in info:
66:     contactLI = []
67:     cl = [s.split(':',1) for s in info['contact'].split('\n')]
68:     for c in cl:
69:         if len(c) == 1:
70:             contactLI.append('<li>%s</li>' % c[0])
71:         else:
72:             contactLI.append('<li><a href="addressbook://%s">%s</a></li>'\
73:                              % tuple(reversed(c)))
74:     info['contactlist'] = '\n'.join(contactLI)
75: else:
76:     info['contactlist'] = ''
77:
78: # Output the template with the dynamic information substituted in.
79: print template.safe_substitute(info)
It does basically five things:
- It processes the given Markdown file through MultiMarkdown (Fletcher Penney’s extended version of Markdown that includes support for tables and other niceties) and SmartyPants (John Gruber’s typographical conversion program that substitutes curly quotes for straight quotes and m- and n- dashes for multiple hyphen sequences). See Lines 16 and 17.
- It concatenates the header template, the just-generated main content, and the footer template into a new string template that will later be turned into the HTML page file. See Lines 20 and 21.
- It queries the file system for info about the source Markdown (.md) file and creates a set of dictionary entries with that information. See Lines 31-46.
- It goes through the project.info file, and creates another set of dictionary entries with the data it reads from that file. See Lines 48-76.
- It creates the HTML page by substituting the dictionary entries from Steps 3 and 4 into the string template from Step 2.
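The last step leans on safe_substitute from Python’s string.Template, which leaves alone any placeholder it has no value for instead of raising an error. A small sketch of that behavior, with short stand-in strings in place of the real header, content, and footer:

```python
import string

# Stand-ins for the three parts; the real ones come from header.tmpl,
# the MultiMarkdown output, and footer.tmpl.
parts = ['<title>$projname ($projnumber)</title>\n',
         '<p>Costs were under $5.</p>\n',
         '<!-- This page built: $buildtime -->\n']
template = string.Template(''.join(parts))

info = {'projname': 'Project Name', 'projnumber': '9999'}
page = template.safe_substitute(info)
print(page)
```

Here $projname and $projnumber are filled in, while $buildtime (not in the dictionary) and the stray $5 in the content pass through unchanged. With plain substitute, either one would stop the build with an exception.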
header.tmpl and footer.tmpl
The header template file looks like this:
1: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2: "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3: <html>
4: <head>
5: <title>$projname ($projnumber)</title>
6: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
7: <link rel="stylesheet" type="text/css" media="all" href="notes.css" />
8: <link rel="stylesheet" type="text/css" media="print" href="notes-print.css" />
9: <!-- <script type="text/javascript" src="file:///Users/drang/Library/JavaScript/jsMath/easy/load.js"></script> -->
10: <script type="text/javascript" src="styleLineNumbers.js"></script>
11: <script type="text/javascript" src="notesList.js"></script>
12: </head>
13: <body onload="styleLN()">
14: <div id="container">
15: <div id="title">
16: <h1 class="left">$projname</h1>
17: <h1 class="right">$projnumber</h1>
18: </div> <!-- title -->
19: <div id="sidebar">
20: <h1>Project notes:</h1>
21: <ul>
22: <script type="text/javascript">showNotesList()</script>
23: </ul>
24: <hr />
25: <h1>Contacts:</h1>
26: <ul>
27: $contactlist
28: </ul>
29: <hr />
30: <h1>Source:</h1>
31: <ul>
32: <li><a href="txmt://open?url=file://$mdpath">Edit in TextMate</a></li>
33: <li>Last modified<br />
34: $modldate<br />
35: at $modtime</li>
36: </ul>
37: <hr />
38: <ul>
39: <li><a href="txmt://open?url=file://$infopath">Edit project info</a></li>
40: </ul>
41: </div> <!-- sidebar -->
42:
43: <div id="note">
Although it’s called header.tmpl, you’ll see that it really contains both the header and the sidebar.
Line 9 in the <head> section contains a call to a JavaScript file that isn’t in the notes folder. This is one of the files that comes with the jsMath library, a set of JavaScript and PNG files created by Davide Cervone that allow equations to be embedded in the pages without the need for MathML support. Since most people don’t need equations in their notes, I’ve commented this line out. My project notes often do need equations, so I usually have this line uncommented and it brings in the jsMath library from its spot in my ~/Library/JavaScript folder.
The footer template looks like this:
1: <hr />
2: <p class="info">
3: Source: <a href="txmt://open?url=file://$mdpath">$mdpath</a><br />
4: Last modified: $modldate at $modtime<br />
5: <!-- This page built: $buildtime -->
6: </p>
7: </div> <!-- note -->
8: </div> <!-- container -->
9: </body>
10: </html>
It adds a little notation at the bottom of the page, telling where the source file is and when it was last updated.
The structure of the resulting HTML page is pretty simple:
<div id="container">
<div id="title">
blahblahblah
</div> <!-- title -->
<div id="sidebar">
blahblahblah
</div> <!-- sidebar -->
<div id="note">
blahblahblah
</div> <!-- note -->
</div> <!-- container -->
CSS
I’m not going to go through the CSS files because they’re long and not that interesting. Suffice it to say that notes.css floats the sidebar to the right and defines a set of colors, type sizes, and spacing that I find pleasing. Not surprisingly, it’s quite similar to the layout of this blog. The CSS file for printing, notes-print.css, hides the sidebar because navigation links don’t work on paper and turns all the colors to black and white because that’s the kind of printer I use.
styleLineNumbers.js
If a notes file contains source code with line numbers, the JavaScript functions in this file will style it nicely and allow me to toggle the line numbers on and off. It’s the same set of functions I use on this blog and which I’ve described in an earlier post.
All together now
If you’re interested in playing around with this system, I’ve made a zip file of my skeleton notes folder available to download. Have fun, and let me know of any improvements you make.
My no-server personal wiki—Part 2
June 25th, 2008
In my last post, I gave my reasons for needing a wiki for project notes, what I wanted it to do, and why I decided to roll my own instead of going with any of the readily available free or commercial offerings. In this post, I’ll describe how my system works. If some of the choices I’ve made seem odd to you, the earlier post should explain the whys and wherefores.
The files for the wiki—both the content and the build files—sit in a folder called “notes.” This makes the system portable. I keep a skeleton version of this folder on my Desktop, and whenever I need a notes wiki for a new project, I drag a copy from the Desktop into the project folder. Here’s a view of the skeleton.

The content is kept in the Markdown-formatted files with an .md extension. The skeleton provides just one of these, aa-overview.md, which is filled with mostly nonsense. Any file added to the folder with an .md extension will be taken as the content for a new page. The wiki pages themselves are the .html files with the same basenames as the .md files. Everything else in the folder is either a programming, template, or configuration file.
The HTML version of aa-overview is intended to be the home page of the wiki. For the skeleton file, it looks like this:

The skeleton aa-overview.md, as with all the Markdown files, contains just the content of the main section of the page:
# Overview #
## Background ##
This is where I describe the project.
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
The header, the sidebar, and the footer are all built from the template and configuration files. The page is built statically, so aa-overview.html is a full HTML file, not a fragment.
The top section of the sidebar contains links to all the notes files in the folder. The links are created automatically by the build script, which looks through the folder for all the .md files. These links will be in the sidebar of every page, so I can go directly from any page in the wiki to any other. This would make for a very cluttered sidebar if my projects had hundreds of notes pages, but I’ve never needed more than a dozen or so.
Each page’s name is taken from the top line of its .md file. The page links are sorted according to the file name, which is why the overview page is given the “aa-” prefix—I want it at the top of the list. I use this method of sorting to group similar pages in the list of links; inspection notes have an “ii-” prefix, lab notes have an “ll-” prefix, etc.
The project name and number in the header and the contact link in the sidebar come from the project.info file:
projname = Project Name
projnumber = 9999
contact = Apple:A1A2AA41-FA30-40AE-9925-FD6DB270B0A5%3AABPerson
This file defines variables that are used by the build script to create the pages. The projname and projnumber are obvious; the contact variable contains the name of the contact (“Apple” for the skeleton file) and its Address Book ID, separated by a colon. I described a script for extracting the ID earlier this month—this is where I use it. When you click on a contact link in the sidebar, the Address Book opens to the card for that entry.

This is convenient when I’m reviewing a project and want to call or email my client.
Although the skeleton version of the project.info file has just one contact, a page can have any number of contact links. I just add more contact = lines to the file.
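To make the file format concrete, here is a rough Python sketch of how name = value lines like these, including repeated contact lines, can be collected and turned into sidebar list items. The function names and the second contact are invented for the illustration.

```python
def parse_project_info(lines):
    """Collect name = value pairs; repeated names are joined with newlines."""
    info = {}
    for line in lines:
        if line.startswith('#') or line.strip() == '':
            continue                      # skip comments and blank lines
        name, value = [s.strip() for s in line.split('=', 1)]
        if name in info:
            info[name] += '\n' + value
        else:
            info[name] = value
    return info

def contact_items(info):
    """Build the sidebar list items from the 'contact' entries."""
    items = []
    for entry in info.get('contact', '').split('\n'):
        if entry == '':
            continue
        parts = entry.split(':', 1)       # display name, Address Book ID
        if len(parts) == 1:
            items.append('<li>%s</li>' % parts[0])
        else:
            items.append('<li><a href="addressbook://%s">%s</a></li>'
                         % (parts[1], parts[0]))
    return '\n'.join(items)

sample = ['projname = Project Name',
          'projnumber = 9999',
          'contact = Apple:A1A2AA41-FA30-40AE-9925-FD6DB270B0A5%3AABPerson',
          'contact = Jane Doe']           # hypothetical contact with no ID
info = parse_project_info(sample)
print(contact_items(info))
```

A contact with an ID becomes an addressbook:// link; one without an ID is listed as plain text.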
I can, of course, edit a page by going to the Finder, working my way to the appropriate notes folder, and opening the .md file in my text editor. But I’ve made it easier than that. The editor I use, TextMate, is the handler for special URLs of the form
txmt://open?url=file:///Users/drang/Desktop/notes/aa-overview.md
By including a link like that in the sidebar, I can just click on it and TextMate will open the .md file associated with the page. The bottom of the sidebar has a similar link to the project.info file.
I mentioned before that the HTML files in the notes folder are built statically, which means that, unlike a regular wiki, I have to do something to get them built. And what I need to do is run the make utility on the Makefile in the notes directory. This could be done through the Terminal, but I’ve made a simple TextMate command that runs make whenever I type Command-Option-Control-M.

There’s probably a clever way to use Folder Actions to run the Makefile whenever an .md file is added or changed, but I haven’t looked into that yet.
In summary, when I want to create a new wiki for project notes, I
- Drag the skeleton notes folder from my Desktop to the project folder.
- Edit the project.info and aa-overview.md files to include the appropriate information.
- Add new .md files as my work on the project grows.
In my next post in this series, I’ll describe the template files and build scripts and will provide a zipfile with my skeleton notes folder.
Update
Part 3 is now up.
My no-server personal wiki—Part 1
June 24th, 2008
This is the first of what I expect to be a two- or three-part post on a wiki-like note taking system I’ve developed for keeping track of the work I do on my professional projects. I’ll start by explaining what I want out of such a system and how other solutions didn’t fit my needs.
At any given time, I’m working on several projects, each with its own client and its own set of facts, figures, and research results. My habit has always been to keep all computer files related to a particular project in its own folder (usually with subfolders) on my hard disk, just as I keep all physical files related to the project in a set of manila folders that are labeled and kept together in my file cabinets. Keeping everything related to a project together is important for three reasons:
- It’s just easier to keep track of things this way. Despite the improvements in Spotlight, dragging a file into a folder is less time-consuming than tagging it with a project name.
- Sometimes I need to copy all my work on a project and send it to the client. My paper files go off to a copy shop for photocopying, and my computer files get burned to DVD. If all the computer files are in one folder, burning a DVD is a one-step operation.
- When the project is done, I archive the computer files to DVD, put them with the paper files, and send them off to storage. Again, if all the computer files are in one folder, burning a DVD is a one-step operation.
Until recently, most of my project notes were on paper rather than in computer files. This was primarily because most of my notes are taken in the field, away from my computer, and there’s been no organizational advantage to rewriting those notes on the computer. But I did want the notes on the computer, so I began looking for ways to organize my notes that way.
At first, a wiki seemed like a natural fit. It’s easy to create new pages, and the pages are easy to navigate and edit. If I choose the right wiki software, I can use Markdown formatting, which I use to write almost everything these days. Unfortunately, the usual server-based wikis keep all the information in a central database, which means that my project notes would not be kept in the project folder. This makes it too difficult to archive the notes with the other project files, so server-based wikis were out.
The notion of a file-based wiki led me to TiddlyWiki and its various offshoots. Because it’s run with JavaScript, TiddlyWiki would let me keep my notes for a particular project in a single big HTML file in the project folder, just as I want. But there were two problems with TiddlyWiki: First, I didn’t like the default style and found its CSS structure very difficult to delve into and modify. Second, TiddlyWiki doesn’t work well with Safari because Safari’s JavaScript doesn’t like the idea of saving changes to the HTML file, which is essential to the idea of a wiki. (I tried using the Java applet that gets around this problem, but it didn’t work for me. I suspect I could have gotten it to work if I’d kept at it, but since I didn’t like the look of TiddlyWiki it didn’t seem worth the effort.)
So then I moved on to VoodooPad, Gus Mueller’s personal wiki application for the Mac. This seemed like the perfect solution. Its pages can be styled however I like, its data are saved to a file that can be put anywhere on my hard disk and can be exported to various open formats—this is important because I don’t want to get stuck in a proprietary system—and it’s backed by a developer known to be responsive to his customers. I tried it, I bought a license for it, but I just couldn’t get used to using it because it doesn’t use Markdown. I was a bit surprised at this reaction, but after a decade of using text-only systems (SGML, LaTeX, and now Markdown) I just couldn’t stand using something that works like a word processor. Eventually, I gave up and gave my copy of VoodooPad to my daughter to help her organize her college notes. Since it came from me, I suspect she’s ignored it—most parents of teenagers will understand that—but I still think VoodooPad would be a great solution for someone who likes writing in a word processor.
At this point, I decided to create my own system. The rules I set for myself were:
- It has to use text files that can be stored anywhere and are easily moved and archived.
- The notes are to be written in Markdown. If I later decide that Textile or reStructuredText or something else is better, changing to the new markup system should be easy.
- The notes are to be written in a text editor rather than in an HTML text box so I don’t feel like I’m writing in a straightjacket.
- The notes should be styled according to my taste. Since my taste can change, the style should be easy to change.
- The creation of new pages and new links to those pages should be simple and/or automatic.
I’ll start describing the system I came up with in the next post.
Update
Part 2 and Part 3 are now up.
Address Book URLs, revisited
June 5th, 2008
In an earlier post I discussed the Mac’s addressbook URI scheme and how you can open a particular contact in your Address Book with a command like
open addressbook://A1A2AA41-FA30-40AE-9925-FD6DB270B0A5:ABPerson
from the Terminal. Everything after the double slash is the ID of the contact, accessible via the id property in AppleScript. In a similar way, you can create links in HTML documents which, when clicked, will open Address Book to the contact with that ID. (On my work computer, the above ID is for Apple, Inc.; the ID for Apple on your computer will be different.) I’ve been using such links a lot recently—for a serverless wiki-like system I’ll be writing about soon—and needed a utility for quickly extracting the ID of an Address Book contact.
The Address Book opener described in that earlier post used a pair of scripts—one AppleScript and one bash script. I never liked that setup, because the AppleScript was somewhat convoluted and because it used two scripts to do basically one thing. So even though the ID extraction script would be almost identical to the Address Book opener, I didn’t want to reuse my old code.
The script works like this: entering
abid john smith
at the Terminal prints the ID of the first contact with the names “john” and “smith.” Any number of arguments can be given to the abid command; the name of the contact must have each of the arguments as a substring. The search through the Address Book is case-insensitive. If no match is found, abid prints “No matches!” instead of the ID.
Here’s the code. It’s written in Python using the appscript module.
1: #!/usr/bin/python
2:
3: import sys
4: from appscript import *
5:
6: # Find the contacts that have all the names in the given list.
7: def searchName(nameList):
8:     names = [ x.lower() for x in nameList ]
9:     # Get everyone whose last name matches.
10:     matches = [ x for x in app('Address Book').people.get() if names[-1] in x.name.get().lower() ]
11:     # Look for matches with the other names if there are any.
12:     while len(names) > 1:
13:         del names[-1]
14:         matches = [ x for x in matches if names[-1] in x.name.get().lower() ]
15:     # Return the list of matches.
16:     return matches
17:
18: # Print the ID of the top match. Or print an error message.
19: try:
20:     print searchName(sys.argv[1:])[0].id.get().replace(':', '%3A')
21: except IndexError:
22:     print "No matches!"
The searchName function does almost the same thing as the identically-named function in my old AppleScript, but is much shorter and the logic is cleaner. Which is why I wanted to get away from AppleScript. I suppose the really cool, Lispy way to write searchName would be to make it recursive instead of iterative, but the while loop works fine.
The name of a contact is the full name—prefix, first, middle, last, suffix—as a string. If the contact is a company, the name is the company name. The list comprehensions in Lines 10 and 14 are basically substring filters.
You can see in Line 20 that I’ve URL-encoded the colon near the end of the ID, replacing it with %3A. The encoding isn’t necessary to make the URL legal, but eliminating that colon from the string makes the ID easier to use in the serverless wiki I mentioned at the top of this post.
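The replace call handles the one colon by hand. As a sketch, assuming Python 3’s urllib.parse (newer than the Python the script was written for), the standard library can do the same encoding:

```python
from urllib.parse import quote

contact_id = 'A1A2AA41-FA30-40AE-9925-FD6DB270B0A5:ABPerson'

# safe='' asks quote() to percent-encode every reserved character,
# including the colon; letters, digits, and hyphens are left alone.
encoded = quote(contact_id, safe='')
print(encoded)
```

For an ID of this shape the result is the same string the replace call produces.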
The program works through the argument list from back to front instead of front to back. I expected to give it the names in “first last” order, and I thought that filtering by last name first would be faster because identical last names are rarer than identical first names and the list comprehension in the while loop would therefore be working with smaller lists. This notion could be completely off base; I haven’t done any benchmarking.
There’s no need to pass abid more than one name if one of the contact’s names is unique. I have only one Adolf in my Address Book, so
abid adolf
is all I need to get his ID. Nicknames will work if the nickname is a substring of the name given in Address Book. For example, if someone is listed as Robert Johnson, then
abid rob johnson
will find him, but
abid bob johnson
will not.
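The matching rule can be tried out without Address Book or appscript. In this sketch the contacts are plain strings and the names are made up; search_names is a hypothetical stand-in for searchName.

```python
def search_names(contact_names, args):
    """Return the contact names that contain every argument as a substring."""
    wanted = [a.lower() for a in args]
    matches = list(contact_names)
    # Work from the last argument to the first, as abid does.
    for term in reversed(wanted):
        matches = [n for n in matches if term in n.lower()]
    return matches

contacts = ['Robert Johnson', 'Adolf Example', 'Roberta Jones']   # made-up names
print(search_names(contacts, ['rob', 'johnson']))
print(search_names(contacts, ['bob', 'johnson']))
print(search_names(contacts, ['adolf']))
```

“rob” is a substring of “robert,” so the first search finds Robert Johnson; “bob” is not, so the second comes back empty.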
This script would likely be unnecessary if Address Book would show us the ID and let us copy it into the clipboard. But I haven’t found any way to get Address Book to show us the ID; if you know of a way, I’d like to hear about it.
Simple AppleScript tuneup for iTunes
June 2nd, 2008
My iTunes library has many songs that I ripped back in my Linux days, when I got track information from freeDB, a service that was started up when Gracenote changed the licence on the previously-free CDDB database. As with Gracenote—but to a greater extent than with Gracenote—some of freeDB’s entries are a little funky.
One set of entries that’s been bothering me for years is Creedence Clearwater Revival’s Chronicle album of greatest hits. The name field for every track on that album is prefixed with “CCR - ”. Thus,
CCR - Susie Q
CCR - I Put A Spell On You
CCR - Proud Mary
and so on. Eventually I got tired of seeing this and wrote the following AppleScript to delete the six-character prefix.
1: tell application "iTunes"
2: set sel to selection
3: repeat with theTrack in sel
4: set theName to name of theTrack
5: set len to length of theName
6: set name of theTrack to (characters 7 thru len of theName as text)
7: end repeat
8: end tell
It probably took a bit longer to write the script than it would have to change the 20 or so tracks by hand, but it was more interesting. As you can see from Line 2, the script assumes that all the tracks that need to be changed are selected before the script is run.
The first time through, I thought Lines 5 and 6 could be the single line
set name of theTrack to (characters 7 thru last of theName as text)
but apparently the “last” specifier can’t be used that way. Hence the “len” variable in Line 5 to get the index of the last character. (Had I written this in Python and appscript, I could have done that with a single line because Python has a simple way of indexing the end of the string without knowing the string length. But by the time I realized “last” wouldn’t work the way I wanted, I didn’t feel like rewriting in Python.)
The “as text” stuck on the end of Line 6 is needed because without it “characters 7 thru len of theName” will return a list of characters instead of a string.
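For what it’s worth, the Python indexing mentioned above looks like this. The sketch covers only the string handling; getting and setting track names in iTunes through appscript is not shown.

```python
prefix = 'CCR - '
names = ['CCR - Susie Q', 'CCR - I Put A Spell On You', 'CCR - Proud Mary']

# Slicing from len(prefix) to the end needs no explicit string length.
cleaned = [name[len(prefix):] for name in names]
print(cleaned)
```

Python counts from zero, so the slice starts at index 6, which is the same character AppleScript calls character 7.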
This is very much in the Unix tradition of one-shot, throwaway scripts. It solves a specific problem by automating a tedious process and will never be used again.
Rerating iTunes tracks on the fly
May 14th, 2008
My usual iTunes playlist for office listening is a smart playlist that contains tracks with a rating of 3 stars or more that I haven’t listened to in at least a week. When I add new tracks to the library, I always give them 3 stars to start out so they get into that playlist. Some just don’t cut the mustard and need to be downgraded. Since the best time to downgrade a song is while I’m listening to it, and since I usually have iTunes hidden while I work, I wrote an AppleScript to do the downgrade and used FastScripts to bind it to the easily-remembered keystroke Control-Option-Command-downarrow. (It is easy to remember. First, almost all of my customized keystrokes use Control-Option-Command because I’ve never seen it used in an application and it’s easy to mash down those three adjacent keys simultaneously. And the downarrow is pretty obvious, isn’t it?)
Here’s the script:
tell application "iTunes"
  set downTrack to current track
  next track
  set oldRating to rating of downTrack
  set newRating to oldRating - 20
  set rating of downTrack to newRating
end tell
It first makes a variable to hold the track I want to downgrade and skips to the next track in the currently-playing playlist. Then it goes to work downrating the subpar track by one star (which is equivalent to 20 points in AppleScript terms).
Why does it skip to the next track before downrating? Recall that my smart playlist contains only tracks with 3 or more stars. If I downrate a playing track to 2 stars, it’s immediately removed from the playlist, there’s no selected track, and iTunes stops playing. Skipping to the next song before downrating ensures that the music continues.
And no, of course I didn’t think of that before I wrote the script. I endured several music interruptions before I realized what was going on and how to prevent it.
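The rating arithmetic behind the downgrade can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the clamp at zero is an addition that the AppleScript above doesn't have.

```python
POINTS_PER_STAR = 20  # from the post: one star is 20 rating points

def downrate(rating, stars=1):
    """Return the rating lowered by `stars`, never going below zero.

    The clamp at zero is an assumption added for this sketch.
    """
    return max(0, rating - stars * POINTS_PER_STAR)

def in_smart_playlist(rating, minimum_stars=3):
    """True if a track with this rating meets the playlist's star threshold."""
    return rating >= minimum_stars * POINTS_PER_STAR

new_rating = downrate(60)             # a 3-star track drops to 2 stars
print(new_rating, in_smart_playlist(new_rating))
```

A 3-star track falls below the playlist's threshold as soon as it's downrated, which is why the script moves to the next track first.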
Tuning up an old script
May 13th, 2008
Back in December I described an AppleScript that toggled either PandoraBoy or iTunes between play and pause. I used FastScripts to map it to the F13 key on my iMac at work, so I had just one button to press to stop the music when I had to take a call and restart it when the call was over.
Recently I’ve noticed a problem with the script when iTunes is the active music player. Whenever I attach my iPod or iPhone, the iTunes browser window shifts from the playlist I’m listening to and displays the summary information for the connected device. This shift happens even if iTunes is hidden, which it usually is. Now if I hit F13 to pause the music, the music stops; but when I hit F13 again, the music doesn’t restart because there are no songs in the iPod/iPhone summary view. To restart the music, I have to unhide iTunes, navigate to the playlist I was in, and click the Play button—which sort of defeats the purpose of having a single Play/Pause button.
I didn’t notice this problem when I first started using my script, because my habit then was to connect my iPod nano when I got to the office and not disconnect it until I left. Now I usually disconnect the nano sometime after it’s synced, and connect and disconnect the iPhone a couple of times during the day to make sure the phone’s calendar events and contacts list are up to date.
My solution was to add a couple of lines to the MultiPlayPause script. The lines are
set playingList to current playlist
set view of front window to playingList
and they are added in the two locations in the script where iTunes gets paused. The effect is to force the iTunes browser to show the playlist I’m listening to, even if I’ve connected my iPod/iPhone.
The full MultiPlayPause script now looks like this:
-- find out what's running
tell application "System Events"
  set numBoy to count (every process whose name is "PandoraBoy")
  set numTunes to count (every process whose name is "iTunes")
end tell

(*
This is the logic of the following:
* if whatever is playing, pause it
* if both are running but neither are playing, start playing PandoraBoy
* if only one is running, flip its play/pause state
* if neither are running, do nothing
*)

if numBoy > 0 then
  if numTunes > 0 then -- both PandoraBoy and iTunes are running
    tell application "iTunes"
      if player state is playing then -- iTunes is playing
        set playingList to current playlist
        set view of front window to playingList
        pause
        tell application "PandoraBoy"
          if player state is playing then -- both PandoraBoy and iTunes are playing
            playpause
          end if
        end tell
      else -- iTunes is not playing
        tell application "PandoraBoy" to playpause
      end if
    end tell
  else -- PandoraBoy is running but iTunes is not
    tell application "PandoraBoy" to playpause
  end if
else if numTunes > 0 then -- iTunes is running but PandoraBoy is not
  tell application "iTunes"
    set playingList to current playlist
    set view of front window to playingList
    playpause
  end tell
end if
I’m pretty sure this script still has some holes, but it seems to be working well so far.
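One way to look for those holes is to write the branching out as a pure function and walk through the cases. The sketch below mirrors the AppleScript's structure in Python. The function name and the command labels are hypothetical, and nothing here talks to iTunes or PandoraBoy.

```python
def actions(boy_running, tunes_running, tunes_playing=False, boy_playing=False):
    """Return the commands the script would send, in order, as plain labels."""
    if boy_running and tunes_running:
        if tunes_playing:
            cmds = ["iTunes: show current playlist", "iTunes: pause"]
            if boy_playing:
                # Both players are making noise, so quiet PandoraBoy too.
                cmds.append("PandoraBoy: playpause")
            return cmds
        # iTunes is silent, so PandoraBoy gets toggled.
        return ["PandoraBoy: playpause"]
    if boy_running:
        return ["PandoraBoy: playpause"]
    if tunes_running:
        return ["iTunes: show current playlist", "iTunes: playpause"]
    return []  # neither is running: do nothing

print(actions(True, True, tunes_playing=True))
```

Laid out this way, it's easy to see that when both are running and iTunes is silent, PandoraBoy is toggled regardless of its own state.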
Snapshot/upload utility with GUI
May 9th, 2008
This post describes a short utility that streamlines my workflow in taking screenshots and uploading them to my server. It has a simple GUI and can create both a full-sized and reduced-sized image in a single step.
Longtime readers of this blog will note that I wrote a very similar utility a couple of years ago and described it in this post. In those days, I would invoke the script through Quicksilver, using QS’s Run… action to pass parameters (e.g., the file name) to the script. It worked well, but three things pushed me to write a new version:
- QS’s Run… action stopped working consistently. It would often disappear entirely from my iBook, leaving me no way to use the script.
- I switched from Quicksilver to LaunchBar, in large part because of QS’s quirks. (Here is a more complete rundown of my reasons for switching.) But because LaunchBar doesn’t allow parameters to be passed to scripts, the older script couldn’t be called from it.
- I wanted a graphical user interface so I didn’t have to remember the order of the options.
I call the new program snapftp, and I invoke it through FastScripts. I’ve given it a keyboard shortcut of Control-Option-Command-4, which is similar to the Apple-standard Command-Shift-4 for snapshots of a portion of the screen. The program launches and presents this dialog box.

The filename for the snapshot is entered without the “.png” extension. Three types of snapshot are possible:
- A single capture file in the original size.
- A single capture file, resized according to the width parameter.
- Two capture files, one original and one resized.
By default, the capture files are saved to my Desktop and uploaded to the “images” directory on my server. The checkbox option at the bottom of the window can preclude the upload.
When I click on the Snap button (or press the Return key) the dialog box goes away, and the computer acts very much like it does when I press Command-Shift-4. The biggest difference is that whereas Command-Shift-4 starts in rectangle capture mode, snapftp starts in window capture mode. I prefer starting in window capture mode because I usually want a snapshot of a window. If I need to, I can change to rectangular capture mode by pressing the Space bar.
After the snapshot is made, the capture file appears on my Desktop and, unless told otherwise, is uploaded to my server via FTP. The URL of the uploaded file is put in the Clipboard for later pasting.
If the “Both” option was chosen, there are two capture files. The full-sized capture file has the name I gave it and the resized file has that name with a “-t” appended to it before the “.png” extension (the “t” is for “thumb” even though the resizing may produce an image much bigger than what is normally considered a thumbnail). For example, if I choose “snap” as the file name, “snap.png” will be the full-sized capture file and “snap-t.png” will be the resized capture file. Only the URL to the uploaded version of “snap.png” is put on the Clipboard.
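The naming rule is simple enough to sketch as a standalone function. The name capture_names is hypothetical and isn't part of snapftp itself; the "-t" suffix and ".png" extension are the ones described above.

```python
def capture_names(base, mode, extension=".png"):
    """Return the capture file names produced for a given capture mode.

    mode is one of the radio button choices: "Original", "Resized", or "Both".
    """
    full = base + extension
    thumb = base + "-t" + extension
    if mode == "Both":
        return [full, thumb]
    # "Original" and "Resized" each produce a single file under the plain name.
    return [full]

print(capture_names("snap", "Both"))
```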
There are several choices for adding a Mac-native graphical user interface to a script. The methods I considered are:
- The display dialog command in AppleScript. The information available from this is limited to a line of text or a single button click, which is too little for snapftp.
- CocoaDialog. This has more options than “display dialog” and is easily called from any scripting language, but it still collects only one piece of information from the user.
- Pashua. This has text input, radio buttons, checkboxes, popup menus, etc. and can collect several pieces of information from the user in one fell swoop. Its downside is that it requires an explicit textual description in the script to position the dialog elements.
- On My Command. Although the primary use of this utility is to create contextual menu items (which I have done), it can also create applications that make full use of nib files created by Interface Builder, which is part of Apple’s Xcode Tools.
- Platypus. Although this is supposed to provide a simple way to “Mac-ize” command-line scripts, it doesn’t seem to have any user interface hooks. Perhaps it’s meant only for dropping files onto an app.
Because it provided everything I needed for snapftp and didn’t force me to learn Interface Builder (which has always seemed exotic and scary to me), I chose Pashua. There was some trial and error involved in positioning the various parts of the GUI, but it wasn’t too painful.
Here’s the snapftp source code:
1: #!/usr/bin/python
2:
3: import Pashua
4: import sys, os, shutil
5: from subprocess import *
6: from ftplib import *
7:
8: # FTP and local parameters
9: host = "leancrew.com"
10: baseurl = "http://www.leancrew.com/all-this/images"
11: extension = ".png"
12: user = "drdrang"
13: passwd = "itzaseekret"
14: ftpdir = "public_html/all-this/images"
15: localdir = os.environ['HOME'] + "/Desktop"
16:
17: # Dialog box configuration
18: conf = '''
19: # Window properties
20: *.title = Snapshot FTP
21:
22: # File name text field properties
23: fn.type = textfield
24: fn.default = snap
25: fn.width = 264
26: fn.x = 94
27: fn.y = 130
28: fnl.type = text
29: fnl.default = File name:
30: fnl.x = 20
31: fnl.y = 132
32:
33: # Radio button group properties
34: rb.type = radiobutton
35: rb.option = Original
36: rb.option = Resized
37: rb.option = Both
38: rb.default = Original
39: rb.x = 94
40: rb.y = 52
41: rbl.type = text
42: rbl.default = Capture:
43: rbl.x = 30
44: rbl.y = 92
45:
46: # Resized width text field properties
47: rw.type = textfield
48: rw.default = 400
49: rw.height = 22
50: rw.width = 60
51: rw.x = 263
52: rw.y = 71
53: rwl.type = text
54: rwl.default = width:
55: rwl.width = 50
56: rwl.x = 215
57: rwl.y = 73
58:
59: # Local files checkbox properties
60: lf.type = checkbox
61: lf.label = Local files only
62: lf.x = 32
63: lf.y = 5
64:
65: # Default button
66: db.type = defaultbutton
67: db.label = Snap
68:
69: # Cancel button
70: cb.type = cancelbutton
71: '''
72:
73: # Open the dialog box and get the input.
74: dialog = Pashua.run(conf)
75: if dialog['cb'] == '1':
76:     sys.exit()
77:
78: # Go to the localdir.
79: os.chdir(localdir)
80:
81: # Set the filenames and url.
82: fn = '%s.png' % dialog['fn']
83: fnt = '%s-t.png' % dialog['fn']
84: url = '%s/%s' % (baseurl, fn)
85:
86: # Capture a portion of the screen and save it to the file.
87: Popen(["screencapture", "-iW", fn], stdout=PIPE).communicate()
88:
89: # Resize the file if asked to
90: if dialog['rb'] == 'Resized':
91:     Popen(['sips', '--resampleWidth', dialog['rw'], fn],
92:           stdout=PIPE).communicate()
93: elif dialog['rb'] == 'Both':
94:     shutil.copy(fn, fnt)
95:     Popen(['sips', '--resampleWidth', dialog['rw'], fnt],
96:           stdout=PIPE).communicate()
97:
98: # Upload the file unless told not to.
99: if dialog['lf'] != '1':
100:     ftp = FTP(host, user, passwd)
101:     ftp.cwd(ftpdir)
102:     ftp.storbinary("STOR %s" % fn, open(fn, "rb"))
103:     if dialog['rb'] == 'Both':
104:         ftp.storbinary("STOR %s" % fnt, open(fnt, "rb"))
105:     ftp.quit()
106:
107:     # Put the URL of the uploaded file onto the clipboard.
108:     Popen('pbcopy', stdin=PIPE).communicate('%s/%s' % (baseurl, fn))
My original snapshot utility was written in Perl, but I wrote snapftp in Python. Although Python is my current first choice for a scripting language, I would have stuck with Perl—there would have been much less rewriting—except that I found that Python is curiously faster at launching Pashua than Perl is when invoked from FastScripts. I started with a Perl program that just brought up the dialog box, thinking I would add the logic from the old program later. Although this Perl skeleton was fast on my Intel iMac, it took more than 10 seconds to launch Pashua and bring up the dialog box on my iBook G4, which was far too long. I then switched the skeleton program to Python and found that the dialog box came up almost instantly. I have no idea why.
Lines 9-15 provide the FTP and local parameters needed to ensure that the capture files end up where I want them. If you want your own version of snapftp, you’ll have to change these lines.
Lines 18-71 contain the Pashua configuration for the dialog box. One oddity of Pashua is that it wants the element positions given by their left and bottom coordinates rather than left and top. I got goofy-looking results until that sank in.
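If you're used to thinking in top-down coordinates, the conversion is a one-line subtraction. Here's a minimal sketch. The function name and the numbers are made up for illustration, and it assumes you know the height of the window's content area, which the snapftp configuration never states.

```python
def bottom_from_top(top, element_height, window_height):
    """Convert a distance from the window's top edge to the element's top
    into the distance from the window's bottom edge to the element's bottom."""
    return window_height - top - element_height

# Hypothetical numbers: a 22-pixel-tall field placed 10 pixels below the
# top of a 160-pixel-tall content area.
print(bottom_from_top(10, 22, 160))
```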
After getting the information from the user and putting it into the dialog dictionary (Lines 74-76), snapftp starts the built-in screencapture command-line program. The i option puts it in interactive mode, and the W option starts the interaction in window capture mode. As I said earlier, the user can switch to rectangular capture mode by pressing the Space bar.
If the user asked for a resized version of the capture, Lines 90-96 do the resizing with the built-in sips command-line program. The resampleWidth option sets the width of the resized image but keeps the aspect ratio constant. I’ve set this up to resize on the basis of width because I want to make sure the images fit within the content (white) area of the blog, which is sized according to the reader’s default font size. The default width of 400 pixels should make the image fit even for visitors who use a small default font size.
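The arithmetic behind a constant-aspect-ratio resize is worth a quick sketch. This is not how sips is implemented, and its exact rounding may differ; the function below is just a hypothetical stand-in for the calculation.

```python
def resampled_size(width, height, new_width):
    """Scale (width, height) to new_width, keeping the aspect ratio.

    The height is rounded to the nearest pixel; sips may round differently.
    """
    new_height = int(round(height * float(new_width) / width))
    return new_width, new_height

# A hypothetical 800 x 600 window capture resized to the default width.
print(resampled_size(800, 600, 400))
```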
The last section of the program, Lines 99-108, uploads the image(s) to my server via FTP and puts the URL on the Clipboard using the pbcopy command. Of course, this happens only if the “Local files only” checkbox remains unchecked.
Both sips and screencapture are invoked through Python’s relatively new subprocess module. It’s supposed to replace the older popen functions (os.popen and the popen2 module), which I had just gotten used to. I still prefer Perl’s backticks. And while I’m complaining (mildly), let me say that I also prefer Perl’s FTP library to Python’s. Perl’s way of handling calls to an outside service is to mimic as much as possible the syntax of the service; Python’s way is to make the call using a Pythonesque syntax. While I understand the Python choice, I think it’s an impediment when, as is usually the case, the programmer knows the syntax of the outside service and knows what he would do if he were using that service directly.
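For anyone who misses backticks as much as I do, a small wrapper gets most of the way there. This is a minimal sketch: the helper name backticks is hypothetical, and it assumes the command's output is UTF-8 text.

```python
import sys
from subprocess import Popen, PIPE

def backticks(args):
    """Run a command given as a list of arguments and return its standard
    output as text, roughly like Perl's `command`."""
    out, _ = Popen(args, stdout=PIPE).communicate()
    return out.decode("utf-8")  # assumes UTF-8 output

# Use the running Python interpreter as a portable stand-in for a shell command.
greeting = backticks([sys.executable, "-c", "print('hello')"])
print(greeting.strip())
```

Passing the arguments as a list, as snapftp does, avoids any shell quoting problems with file names.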
I have snapftp saved in my ~/Library/Scripts folder where FastScripts can get at it. The Pashua application is saved in /Applications, as expected, and the Pashua.py Python module is saved in /Library/Python/2.5/site-packages. The module has one line that I feel is unwarranted and that I have commented out. Down near the bottom of Pashua.py is a loop that looks like this:
for Line in Result:
    print Line
    Parm, Value = Line.split('=')
    ResultDict[Parm] = Value.rstrip()
I think a print command in a library utility that is supposed to read lines is just plain wrong, so I’ve commented out the print Line.
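Here's a standalone sketch of what that loop does once the print is gone. It is not the code in Pashua.py: the function name parse_result is hypothetical, and splitting on only the first "=" is a change of mine so that a value containing an equals sign doesn't raise an error.

```python
def parse_result(lines):
    """Turn lines of the form 'key=value' into a dictionary, silently."""
    result = {}
    for line in lines:
        if "=" not in line:
            continue  # skip anything that isn't a key=value pair
        # Split on the first "=" only, so values may contain "=" themselves.
        parm, value = line.split("=", 1)
        result[parm] = value.rstrip()
    return result

# Hypothetical output lines for the snapftp dialog.
print(parse_result(["fn=snap\n", "rb=Both\n", "lf=0\n"]))
```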
Overall, I’ve found snapftp to be much nicer and easier to use than my old Quicksilver-based solution (even when it worked). The addition of the GUI has not made it slower, and it is, of course, easier to remember the options when they’re laid out in front of you. This may inspire me to add a GUI gloss to some of my other scripts.
Formatting flight schedules for trip card
May 7th, 2008
In an earlier post, I described an OmniOutliner template for collecting itinerary information and printing it out on an index card. In this post, I’ll show a simple script that takes flight information from an airline web site and reformats it for the trip card.
One of the tedious parts of making an itinerary is putting the flight information into the right format. The format I like is reflected in the template, which has the departure time first, the arrival time second, and the flight number (or numbers, if it’s a connection) third. The times are right justified and the flight number is left justified.
Unfortunately, the airline web sites have their own ideas about the best way to present flight information, and because their ideas and mine don’t agree, I can’t just copy schedules from their web sites and paste them into my template. I can, however, write a script that takes the information in the airline’s format and rearranges it into my format. Since I usually fly Southwest, that’s the format I’ve focused on.
Southwest likes to arrange its flight information in tabular form. Each row of the table represents a flight, with the flight numbers first, then its departure and arrival times, then a bunch of other information that I don’t particularly care about right now. What I want to be able to do is select some portion of this table from Southwest’s page

paste it into my OmniOutliner trip card template (using the Paste With Current Style command to avoid overwriting the template’s carefully chosen tab settings)

and then run the script that reformats the data.

Here’s the script. I call it “Reformat SW Schedule” and keep it in ~/Library/Scripts/Applications/OmniOutliner. By putting it there, FastScripts—which I use instead of the usual AppleScript menu—will make it available at the top of its menu when OmniOutliner is the active application.
1: #!/usr/bin/python
2:
3: from appscript import *
4: import re
5:
6: # Get the contents of the selected cell in the front document. It should
7: # contain the schedule as it comes from Southwest's site.
8: ooFront = app('OmniOutliner').document.get()[0]
9: selectedCell = ooFront.selected_rows[0].topic
10: swSchedule = selectedCell.get()
11:
12: # Reformat the Southwest schedule to match the trip card format.
13: flightInfo = re.compile('^(\S+)\s+(\S+)\s+(\S+).*$', re.MULTILINE)
14: def format(mo):
15:     infoList = ['\t',
16:                 mo.group(2)[:-1], # departure time, w/o the "m"
17:                 '\t',
18:                 mo.group(3)[:-1], # arrival time, w/o the "m"
19:                 '\t',
20:                 'SW ',
21:                 mo.group(1)] # flight number(s)
22:     return ''.join(infoList)
23: sched = flightInfo.sub(format, swSchedule)
24:
25: # Put the reformatted schedule back into the selected cell.
26: selectedCell.set(sched)
The script expects the outline cell with the badly-formatted schedule to be selected. It takes the contents of that cell, reformats it, and replaces the contents with the reformatted version. It will work with one or more rows of information from the Southwest table.
The script uses the appscript Python library to communicate with OmniOutliner. I suppose I could have used AppleScript instead of Python, but this script is mostly text manipulation and I prefer Python’s text handling tools. As with every AppleScript or appscript program, the hardest part is learning the application’s unique set of scripting commands. That’s taken care of in Lines 8 and 9. Line 10 then puts the contents of the cell into a variable, Lines 13-23 transform it, and Line 26 puts the transformed version back into the cell.
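The text transformation can be tried without OmniOutliner or appscript. The sketch below uses the same regular expression as the script on a made-up pair of rows. The flight numbers and times are hypothetical, and real rows copied from Southwest's page may be laid out differently.

```python
import re

# Same pattern as in the script: flight number(s), departure, arrival, rest.
flight_info = re.compile(r'^(\S+)\s+(\S+)\s+(\S+).*$', re.MULTILINE)

def reformat_row(mo):
    """Rearrange one matched row into tab, departure, tab, arrival, tab, flight."""
    return ''.join(['\t',
                    mo.group(2)[:-1],   # departure time, w/o the "m"
                    '\t',
                    mo.group(3)[:-1],   # arrival time, w/o the "m"
                    '\t',
                    'SW ',
                    mo.group(1)])       # flight number(s)

# Hypothetical rows standing in for text copied from the airline's table.
sw_schedule = "1234\t6:30am\t9:45am\tNonstop\n89/1011\t1:05pm\t4:20pm\t1 stop"
sched = flight_info.sub(reformat_row, sw_schedule)
print(sched)
```

Because the pattern is compiled with re.MULTILINE, each row is matched and rewritten separately, and everything after the arrival time is dropped.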
If I start using another airline regularly, I’ll probably write a similar reformatter for it.



