# An iOS 8 regression

Driving along, listening to a podcast. My iPhone is plugged in to top off its battery. The message alert sounds, interrupting John Siracusa’s complaint about Apple still selling 16 GB iPads and iPhones.

“Hey, Siri.”

“Da-ding?”

“Read me the last text message.”

“There are no messages.”

“Yes there are! You just made the new message sound! What the hell is wrong with you? You used to know how to do this!”

# ASCIIfying

I’ve been adding more automation to my static blog publishing workflow.1 The scripts themselves are of no use to anyone else, but some bits and pieces may be of wider interest. For example, this morning I wrote a script using a library that converts Unicode strings to their nearest ASCII equivalent.

The script, written to be used as a Text Filter in BBEdit, automates the generation of header lines in the Markdown source code of a post. The header of this post, for example, looks like this:

Title: ASCIIfying
Keywords: python, programming
Date: 2014-10-19 22:10:00
Slug: asciifying


I write the Title and Keywords lines as I start the post, using a simple BBEdit Clipping. But before I publish, I need the other lines. The Date is easy to generate using the datetime library. That’s also the library I use to generate the year and month portions of the Link URL. The tricky thing is automating the creation of the Slug, which also shows up in the Link.

Oh, it’s very easy to make a slug when the title is as simple as this one, but suppose we started with this:

Title: Çingleton/Montréal isn't done
Keywords: test


Non-ASCII characters are allowed in URLs, but they can be troublesome, and I prefer to avoid them. Also, we can’t have the slash in there, and the apostrophe ought to go, too. Finally, I don’t want any spaces, because they cause nothing but trouble in the file system, and I hate seeing %20 in a URL.

The function I settled on is this:

python:
1:  def slugify(u):
2:    "Convert Unicode string into blog slug."
3:    u = re.sub(u'[–—/:;,.]', '-', u)  # replace separating punctuation
4:    a = unidecode(u).lower()          # best ASCII substitutions, lowercased
5:    a = re.sub(r'[^a-z0-9 -]', '', a) # delete any other characters
6:    a = a.replace(' ', '-')           # spaces to hyphens
7:    a = re.sub(r'-+', '-', a)         # condense repeated hyphens
8:    return a


All of the lines are straightforward and obvious except the unidecode call in Line 4. That is the one function exported by the unidecode library, and it does the substitutions that make slugify generate strings that are much more useful than anything I could write with the standard encode and decode methods. My script turns that two-line header above into

Title: Çingleton/Montréal isn't done
Keywords: test
Date: 2014-10-19 21:31:22
Slug: cingleton-montreal-isnt-done


which has a perfectly readable URL that includes nothing but lowercase ASCII characters, numerals, and hyphens.
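
For anyone who wants to try the header-completion idea without installing anything, here is a minimal Python 3 sketch. It is not the script described above: the names `ascii_slug` and `complete_header` are made up for illustration, and it swaps `unidecode` for the standard library's `unicodedata` NFKD normalization. That swap is a real downgrade, since it only strips accents from Latin letters and silently drops anything else, but it is enough to reproduce the example header.

```python
# Python 3 sketch. ascii_slug and complete_header are hypothetical names,
# and unicodedata stands in for unidecode (an assumption, not the original).
import re
import unicodedata
from datetime import datetime


def ascii_slug(title):
    """Turn a title into a lowercase ASCII slug using only the stdlib."""
    t = re.sub('[–—/:;,.]', '-', title)      # separating punctuation to hyphens
    t = unicodedata.normalize('NFKD', t)     # split accented letters into base + mark
    t = t.encode('ascii', 'ignore').decode('ascii').lower()  # drop non-ASCII leftovers
    t = re.sub(r'[^a-z0-9 -]', '', t)        # delete any other characters
    t = t.replace(' ', '-')                  # spaces to hyphens
    return re.sub(r'-+', '-', t)             # condense repeated hyphens


def complete_header(header, now=None):
    """Append Date and Slug lines to a header that already has a Title line."""
    now = now or datetime.now()
    title = re.search(r'^Title:\s*(.+)$', header, re.MULTILINE).group(1)
    lines = header.rstrip('\n').split('\n')
    lines.append('Date: ' + now.strftime('%Y-%m-%d %H:%M:%S'))
    lines.append('Slug: ' + ascii_slug(title))
    return '\n'.join(lines) + '\n'


header = "Title: Çingleton/Montréal isn't done\nKeywords: test\n"
print(complete_header(header, datetime(2014, 10, 19, 21, 31, 22)))
```

Passing the timestamp in as an argument, rather than always calling `datetime.now()`, is only there to make the output repeatable when testing.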

The unidecode library is a Python port of a Perl module, and its documentation is sparse. If you want to know what it does and why it does it, go to Sean Burke’s writeup of his original Perl module, Text::Unidecode. It lays out his goals for the module, explains its limitations, and includes little gems like this:

I discourage you from being yet another German who emails me, trying to impel me to consider a typographical nicety of German to be more important than all other languages.

If you ever need to ASCIIfy some text, Text::Unidecode or one of its ports (here’s one for Ruby) will come in handy.

1. “Static blog publishing workflow” may be the most jargon-filled four-word phrase I’ve ever written.

# Circling the drain with Drafts

A Mac-only solution isn’t very satisfying anymore, so last night’s post on using Services to create ⓒⓘⓡⓒⓛⓔⓓ, pǝddılɟ, and s̸t̸r̸u̸c̸k̸ ̸t̸h̸r̸o̸u̸g̸h̸ text felt incomplete. Combining the logic of the Python conversion scripts in that post with what I learned about using JavaScript in Drafts 4, I made three new keyboard scripts to convert selected characters in Drafts.

Each script works pretty much as you’d expect. Type in some normal text, select the portion you want converted, and tap the appropriate button. Voila!

The script that does the encircling is this:

javascript:
1:  function encircle(s) {
2:    var pchars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
3:      , cchars = "ⓐⓑⓒⓓⓔⓕⓖⓗⓘⓙⓚⓛⓜⓝⓞⓟⓠⓡⓢⓣⓤⓥⓦⓧⓨⓩⒶⒷⒸⒹⒺⒻⒼⒽⒾⒿⓀⓁⓂⓃⓄⓅⓆⓇⓈⓉⓊⓋⓌⓍⓎⓏ⓪①②③④⑤⑥⑦⑧⑨"
4:      , count  = pchars.length
5:      , regex  = new RegExp('.', 'g')
6:      , trans  = {}
7:      , lookup = function(c) { return trans[c] || c; };
8:
9:    for (var i=0; i<count; i++) {
10:      trans[pchars[i]] = cchars[i];
11:    }
12:
13:    return s.replace(regex, lookup);
14:  }
15:
16:  setSelectedText(encircle(getSelectedText()))


The structure of the encircle function is pretty much stolen wholesale from one of the answers to this Stack Overflow question. It takes advantage of a feature of JavaScript familiar to old Perl programmers: the replace method can take a function as the replacement argument. That function, lookup, is defined in Line 7 and the trans object it uses to replace characters is built in Lines 9–11. The regular expression in Line 5 that serves as the first argument to replace is very simple because all the work is done by lookup. Line 16 gets the selected text from the draft, converts it, and replaces the selection with the converted text.

Update 10/18/14
People who think I know what I’m doing with this stuff are so misguided. Nathan Grigg, an actual programmer who’s actually smart, pointed out on Twitter that the regular expression I was initially using was far more complicated than it needed to be. He was, of course, absolutely correct, so I’ve simplified it in both this script and the character flipping script below. The descriptions have been edited to match the new code. Thanks, Nathan!

The key that applies this script is built by tapping the pencil icon at the end of Drafts’ custom key strip, tapping the + button to add a new key, choosing the Script option, and then entering the label, name, and script. When you’re done, a new key with that label will appear in the strip.1

The flipping script is built the same way.

javascript:
1:  function flip(s) {
2:    var pchars = "abcdefghijklmnopqrstuvwxyz,.?!'(){}[]"
3:      , fchars = "ɐqɔpǝɟƃɥıɾʞlɯuodbɹsʇnʌʍxʎz'˙¿¡,)(}{]["
4:      , count  = pchars.length
5:      , regex  = new RegExp('.', 'g')
6:      , trans  = {}
7:      , t      = s.toLowerCase()
8:      , lookup = function(c) { return trans[c] || c; };
9:
10:    for (var i=0; i<count; i++) {
11:      trans[pchars[i]] = fchars[i];
12:    }
13:    var a = t.split("");
14:    a.reverse();
15:    return a.join("").replace(regex, lookup);
16:  }
17:
18:  setSelectedText(flip(getSelectedText()))


Apart from the two strings that define the conversion, there are two other differences between this script and the encircler:

1. Because there aren’t good upside-down versions of all the capital letters, everything is converted to lowercase first. That’s done in Line 7.
2. To look decent in the flipped condition, the order of the letters has to be reversed. That’s done by creating an array of characters in Line 13, reversing it in Line 14, and then joining them back together in Line 15.

Finally, there’s the strikethrough script.

javascript:
1:  function strikeout(unstruck) {
2:    var s = String.fromCharCode(824)
3:      , a = unstruck.split('');
4:    if (a.length > 0) {
5:      return a.join(s) + s;
6:    }
7:    else {
8:      return '';
9:    }
10:  }
11:
12:  setSelectedText(strikeout(getSelectedText()))


This one doesn’t do any replacing, because there aren’t “struck out” versions of all the letters. Instead, it puts the COMBINING LONG SOLIDUS OVERLAY character (decimal 824, hex 0338) after every character in the string. The combination appears as a character with a diagonal line through it.

It should be easy enough to use these scripts as guidelines for creating other conversions. sᴍᴀʟʟ ᴄᴀᴘs, for example, seems like something Than Tibbetts should be all over.

Because these key definitions seem fairly simple and robust, I’ve uploaded them to the Keyboard Extensions section of the Drafts Actions Directory:

You can install them from there if you don’t want to go through the character-building2 exercise of making them yourself.

Update 10/19/14
Here’s a clever idea from Jamie Jenkins on Twitter. Put the circled characters at the end of the pchars string and the plain characters at the end of the cchars string, like this:

var pchars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789ⓐⓑⓒⓓⓔⓕⓖⓗⓘⓙⓚⓛⓜⓝⓞⓟⓠⓡⓢⓣⓤⓥⓦⓧⓨⓩⒶⒷⒸⒹⒺⒻⒼⒽⒾⒿⓀⓁⓂⓃⓄⓅⓆⓇⓈⓉⓊⓋⓌⓍⓎⓏ⓪①②③④⑤⑥⑦⑧⑨"
, cchars = "ⓐⓑⓒⓓⓔⓕⓖⓗⓘⓙⓚⓛⓜⓝⓞⓟⓠⓡⓢⓣⓤⓥⓦⓧⓨⓩⒶⒷⒸⒹⒺⒻⒼⒽⒾⒿⓀⓁⓂⓃⓄⓅⓆⓇⓈⓉⓊⓋⓌⓍⓎⓏ⓪①②③④⑤⑥⑦⑧⑨abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


The rest of the script remains the same. With this change, you can toggle between circled and plain using the same key command. Same thing can be done with the pchars and fchars variables in the flip function.
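
The toggle trick works with any translation table, not just the JavaScript one. Here is a rough Python 3 sketch of the same idea. The names `plain`, `circled`, `toggler`, and `toggle` are made up for illustration, and the circled characters are built from their code points rather than pasted in, so no stray invisible characters can sneak along.

```python
# Python 3 sketch of the two-way table; all names here are hypothetical.
plain = ("abcdefghijklmnopqrstuvwxyz"
         "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
         "0123456789")

# Build the circled characters from code points instead of pasting them.
circled = (''.join(chr(c) for c in range(0x24D0, 0x24EA))     # circled a through z
           + ''.join(chr(c) for c in range(0x24B6, 0x24D0))   # circled A through Z
           + chr(0x24EA)                                      # circled 0
           + ''.join(chr(c) for c in range(0x2460, 0x2469)))  # circled 1 through 9

# Plain maps to circled and circled maps back to plain, all in one table.
toggler = str.maketrans(plain + circled, circled + plain)


def toggle(s):
    """Circle plain characters and un-circle circled ones."""
    return s.translate(toggler)


once = toggle("Drafts 4")
print(once)
print(toggle(once))
```

Because every character in the table has exactly one partner, running `toggle` twice gets you back where you started. In Python 3, `str.maketrans` accepts non-ASCII characters directly.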

1. Yes, one screenshot shows the label as Ⓐ, and the other shows it as ⓐ. I changed the name between screenshots and didn’t feel like going back and redoing the earlier one.

2. I have no shame.

# Circle service

Greg Scown of Smile Software wrote a blog post today in which he described a TextExpander group for writing ⓒⓘⓡⓒⓛⓔⓓ ⓛⓔⓣⓣⓔⓡⓢ. He was inspired by the excessive use of such letters in the tweets of Stephen Hackett.

ⓣⓗⓔ ⓐⓟⓟⓛⓔ ⓢⓣⓞⓡⓔ ⓘⓢ ⓓⓞⓦⓝ ⓙⓤⓢⓣ ⓛⓘⓚⓔ ⓔⓥⓔⓡⓨ ⓞⓣⓗⓔⓡ ⓐⓟⓟⓛⓔ ⓔⓥⓔⓝⓣ ⓑⓡⓑ ⓑⓛⓞⓖⓖⓘⓝⓖ
Stephen Hackett (@ismh) Oct 16 2014 7:56 AM

What’s good about Greg’s snippet group is that it works on both the Mac and iOS; what’s less good is the length of the abbreviations:

For example:

oooT gets you: Ⓣ
ooox gets you: ⓧ
ooo4 gets you: ④

Now it’s true that typing three o’s in a row isn’t much more time-consuming than typing just one, but I still prefer to type the text normally and then convert it to circled form. So I fired up Automator and made a Service to do it.

It took almost no time because I’d already made a few services that did similar things: s̸t̸r̸i̸k̸e̸t̸h̸r̸o̸u̸g̸h̸, dılɟ, and EBG13. The circled text service was most like the text flipper, so I copied it and did some quick editing.

In Automator, the circle service looks like this:

The Python script that runs when the Service is invoked is this:

python:
1:  # coding: utf8
2:
3:  from sys import stdin, stdout
4:
5:  pchars = u"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
6:  cchars = u"ⓐⓑⓒⓓⓔⓕⓖⓗⓘⓙⓚⓛⓜⓝⓞⓟⓠⓡⓢⓣⓤⓥⓦⓧⓨⓩⒶⒷⒸⒹⒺⒻⒼⒽⒾⒿⓀⓁⓂⓃⓄⓅⓆⓇⓈⓉⓊⓋⓌⓍⓎⓏ⓪①②③④⑤⑥⑦⑧⑨"
7:  circler = dict(zip(map(ord, pchars), cchars))
8:  stdout.write(stdin.read().decode('utf8').translate(circler).encode('utf8'))


Update 10/18/14
OK, this is kind of weird. There are apparently a couple of ways to get the Ⓜ character, and what I did originally caused the translation to fail with capital letters beyond M. The Unicode code point for Ⓜ, in hex, is 24C2. But for some reason, when you insert it from the Mac Character Viewer, as I did when I first wrote this script, it inserts not just 24C2, but also FE0E, which is Variation Selector-15, a zero-width character.

That messed up the definition of the dictionary in Line 7 and caused all of the higher capital letters to point to a circled capital letter one lower than they should have. For example, S would turn into Ⓡ. To fix this problem, I opened IPython and gave it these two commands

m = unichr(9410)
print m


Because decimal 9410 is hex 24C2, this caused Ⓜ to print without the trailing zero-width character. I used it to clean up the cchars string, and now the service works as it should.

You can’t see any difference between the current script and what I originally posted, because the difference is an invisible character. This is the sort of thing that makes people hate computers.

What I should hate, though, is Apple for sticking an invisible character in where it doesn’t belong. I wonder if that bug has always been there.
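
Since the culprit is invisible, the only way to catch it is to look at code points instead of glyphs. The Python 3 sketch below rebuilds the situation with capitals only. The helpers `describe` and `clean` and the variable names are made up for illustration. It shows the stowaway sitting right after Ⓜ and the off-by-one mapping it causes.

```python
# Python 3 sketch; describe() and clean() are hypothetical helpers.
import unicodedata

VS15 = '\ufe0e'  # VARIATION SELECTOR-15, the invisible stowaway


def describe(s):
    """List each code point in s alongside its Unicode name."""
    return ['U+%04X %s' % (ord(c), unicodedata.name(c, '<unnamed>')) for c in s]


def clean(s):
    """Remove every Variation Selector-15 from s."""
    return s.replace(VS15, '')


capitals = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
good = ''.join(chr(0x24B6 + i) for i in range(26))  # circled A through Z
bad = good[:13] + VS15 + good[13:]                  # selector wedged in after circled M

broken = dict(zip(capitals, bad))         # everything past M shifts down by one
fixed = dict(zip(capitals, clean(bad)))   # back in step once the selector is gone

print(describe(bad[12:14]))
print(describe(broken['S']), describe(fixed['S']))
```

With the selector in place, N maps to the invisible character itself and every later capital maps to the circled letter one position too low, which is the S-becomes-Ⓡ symptom described above.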

The first line tells the Python interpreter that the source code is going to include UTF-8 characters. Lines 5–7 set up a dictionary in which the keys are the character codes of the regular characters and the values are the corresponding circled characters. This kind of dictionary is what the string translate method uses as its argument.1

Line 8 looks more complicated than it is. Basically, it reads standard input, runs the translation, and writes the result to standard output. The messiness comes from the decode and encode methods, which are there to handle the non-ASCII characters. This arrangement is called the “Unicode sandwich,” and I learned about it by watching this excellent talk by Ned Batchelder.
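
Spelled out one step at a time, the sandwich looks something like this Python 3 sketch. It is only an illustration: the function name `circle_stream` is made up, the table covers lowercase letters only to keep things short, and `io.BytesIO` objects stand in for the byte streams a real service would get from standard input and output.

```python
# Python 3 sketch of the "Unicode sandwich"; circle_stream is a hypothetical name.
import io

# Lowercase-only table, keyed by code point as str.translate expects.
circler = {ord('a') + i: chr(0x24D0 + i) for i in range(26)}


def circle_stream(infile, outfile):
    """Read bytes, convert as Unicode text, write bytes."""
    text = infile.read().decode('utf-8')      # bottom slice: bytes to Unicode
    converted = text.translate(circler)       # the filling: work on Unicode only
    outfile.write(converted.encode('utf-8'))  # top slice: Unicode back to bytes


source = io.BytesIO('café'.encode('utf-8'))
sink = io.BytesIO()
circle_stream(source, sink)
print(sink.getvalue().decode('utf-8'))
```

In a real script you would presumably pass `sys.stdin.buffer` and `sys.stdout.buffer` instead of the `BytesIO` stand-ins. Characters that aren't in the table, like the é here, pass through unchanged.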

The service is called and it appears in the submenu when text is selected. But using the submenu is a pain in the ass, so I defined shortcuts for this and the other text conversion services.

I’ve restricted the shortcuts to work only in Dr. Twoot, because Twitter is the only place2 I’ll use them.

The other services are structured the same way in Automator; the only differences are the Python scripts. The post from last year shows older versions of the scripts, which generally worked, but aren’t as robust as what I’m using now.

Here’s the script for flipping characters:

python:
1:  # coding: utf8
2:
3:  from sys import stdin, stdout
4:
5:  pchars = u"abcdefghijklmnopqrstuvwxyz,.?!'()[]{}"
6:  fchars = u"ɐqɔpǝɟƃɥıɾʞlɯuodbɹsʇnʌʍxʎz'˙¿¡,)(][}{"
7:  flipper = dict(zip(map(ord, pchars), fchars))
8:  a = list(stdin.read().decode('utf8').lower().translate(flipper))
9:  a.reverse()
10:  stdout.write(''.join(a).encode('utf8'))


The script for striking through characters is this:

python:
1:  from sys import stdin, stdout
2:
3:  unstruck = stdin.read().decode('utf-8')
4:  struck = u'\u0338'.join(unstruck)
5:  stdout.write(struck.encode('utf-8'))


Strikethrough works for both A̸S̸C̸I̸I̸ and most Ü̸ñ̸î̸ç̸ø̸ƌ̸é̸ characters, but it won’t work on Emoji. I guess Emoji don’t have whatever characteristics are necessary to function with combining characters.

The script for doing a ROT13 is this:

python:
1:  from string import maketrans
2:  from sys import stdin, stdout
3:
4:  alpha = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
5:  rot13 = 'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
6:  r13table = maketrans(alpha, rot13)
7:  stdout.write(stdin.read().translate(r13table))

You’ll notice that I used maketrans in this script and that there’s no Unicode sandwich. That’s because the only characters that get converted are ASCII. Everything that isn’t an ASCII letter passes through untouched, so multi-byte characters don’t need to be decoded and encoded. That’s what my testing shows, anyway.
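
Here is a Python 3 sketch of why skipping the sandwich is safe for ROT13. It works on raw bytes with `bytes.maketrans`, and the function name `rot13_bytes` is made up for illustration. In UTF-8, every byte of a multi-byte character is 0x80 or higher, so none of them can collide with the ASCII letters in the table.

```python
# Python 3 sketch operating on raw bytes; rot13_bytes is a hypothetical name.
alpha = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
rot13 = b'nopqrstuvwxyzabcdefghijklmNOPQRSTUVWXYZABCDEFGHIJKLM'
r13table = bytes.maketrans(alpha, rot13)


def rot13_bytes(raw):
    """Rotate ASCII letters by 13 and leave every other byte alone."""
    return raw.translate(r13table)


raw = 'Montréal ⓒ'.encode('utf-8')  # é is two bytes, ⓒ is three
print(rot13_bytes(raw).decode('utf-8'))
print(rot13_bytes(rot13_bytes(raw)).decode('utf-8'))
```

The é and the ⓒ come out untouched, and applying the function twice returns the original text.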

1. In theory, you can use the string.maketrans function to create this kind of table, but it’s never worked for me when Unicode characters are involved.