Time zones in blackbirdpy

After figuring out how to get time zones straightened out in my Twitter archiving script, it seemed like a good idea to use that knowledge to improve my embedded tweets before I forget how.

To embed tweets in my blog posts, I use my own variant on Jeff Miller’s blackbirdpy. Jeff’s Python script was itself a rewrite of Robin Sloan’s original JavaScript tweet embedder, Blackbird Pie.

Why don’t I just use Twitter’s own embedding code? Two reasons:

  1. I don’t especially like the way it looks.
  2. I don’t like the idea of having Twitter’s code injected into my pages. (We all know how painful that can be.)

In the past, the timestamps of the tweets I embed here have included the date only. (You can see a couple of examples in my previous post.) That’s because the timestamp returned by the Twitter API is in UTC, and it was easier to just take that value and use the date part than to work out the time zone conversion. I wasn’t truly satisfied with the result, because evening tweets here in the US/Central time zone were often stamped with the following day, but it was easier to live with that small discrepancy than to learn how to do the conversions properly.

The tweet archiving of the past few days has taught me a bit about handling time zones in Python, so it’s time to take what I’ve learned and fix up blackbirdpy.

My blackbirdpy system has three parts:

  1. A Python script, blackbird.py, which, when given the URL of a tweet as an argument, returns a chunk of HTML with the contents and attribution information of that tweet.
  2. A TextExpander snippet that grabs the URL of the frontmost Safari tab, feeds it to blackbird.py, and prints the HTML output.
  3. A JavaScript function, styleTweets, that formats the tweet the way I like for presentation here on the blog. It uses the Twitter API to get some of the style information.

Although I did make a small change to styleTweets, most of the changes were in blackbird.py, and that’s the script I’m going to focus on here. If you care, you can see all the code and its history of changes in the GitHub repository.

The most important changes to blackbird.py are in these two functions:

 80:  def timestamp_string_to_datetime(text):
 81:      """Convert a string timestamp of the form 'Wed Jun 09 18:31:55 +0000 2010'
 82:      into a Python datetime object."""
 83:      tm_array = email.utils.parsedate_tz(text)
 84:      dt = datetime(*tm_array[:6]) - timedelta(seconds=tm_array[-1])
 85:      dt = pytz.utc.localize(dt)
 86:      return dt.astimezone(myTZ)

 89:  def easy_to_read_timestamp_string(dt):
 90:      """Convert a Python datetime object into an easy-to-read timestamp
 91:      string, like 'Wed Jun 16 2010 5:22 PM CDT'."""
 92:      return dt.strftime("%a %b %-d %Y %-I:%M %p %Z")

The use of parsedate_tz from the email.utils library in Line 83 is a holdover from Jeff Miller’s code. You might think I could use the strptime function from datetime to parse the timestamp, but whenever I try that, using

dt = datetime.strptime(text, "%a %b %d %H:%M:%S %z %Y")

I get the error

ValueError: 'z' is a bad directive in format '%a %b %d %H:%M:%S %z %Y'

But parsedate_tz handles the UTC offset just fine.
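A note for anyone running this under a newer Python: the `%z` failure is a limitation of older versions, and `strptime` accepts `%z` in Python 3.2 and later. Here is a minimal sketch, assuming Python 3 and only the standard library, that parses the same timestamp both ways. The function names are made up for illustration and are not part of blackbird.py, and `timezone.utc` stands in for pytz here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_tz

def parse_with_email_utils(text):
    """Parse a Twitter-style timestamp into an aware UTC datetime."""
    tm = parsedate_tz(text)  # 10-tuple; the last item is the UTC offset in seconds
    # Subtracting the offset turns the wall-clock fields into UTC.
    naive_utc = datetime(*tm[:6]) - timedelta(seconds=tm[-1])
    return naive_utc.replace(tzinfo=timezone.utc)

def parse_with_strptime(text):
    """Same result via strptime, relying on %z support (Python 3.2+)."""
    dt = datetime.strptime(text, "%a %b %d %H:%M:%S %z %Y")
    return dt.astimezone(timezone.utc)

stamp = "Wed Jun 09 18:31:55 +0000 2010"
print(parse_with_email_utils(stamp))  # 2010-06-09 18:31:55+00:00
print(parse_with_strptime(stamp))     # 2010-06-09 18:31:55+00:00
```

Both functions assume the string carries an explicit offset, as Twitter's timestamps do; `parsedate_tz` returns `None` for the offset when it is missing, and the subtraction would then fail.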

The final item in the list returned by parsedate_tz is the UTC offset, in seconds. As far as I know, this will always be zero when the timestamp comes from a Twitter API call, but Line 84 will do the conversion to UTC even if it isn’t. Line 85 then makes the datetime object “time zone aware,” and Line 86 converts it to my local time before returning it. The myTZ variable is defined earlier in the code as

myTZ = pytz.timezone('US/Central')  

(I said in my last post that the pytz library was the way to go if you need to deal with multiple time zones. That still stands.)
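For readers without pytz installed, the conversion step in Lines 85 and 86 can be sketched with the standard library's `zoneinfo` module instead. This is a swapped-in technique, not what blackbird.py does: it assumes Python 3.9 or later and a system time zone database, and `MY_TZ` and `utc_to_local` are illustrative names. The two UTC times are back-calculated from the Central timestamps of the tweets embedded further down.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Stand-in for the post's myTZ; 'America/Chicago' is the canonical
# name for the zone that 'US/Central' refers to.
MY_TZ = ZoneInfo("America/Chicago")

def utc_to_local(dt_utc):
    """Convert an aware UTC datetime to Central time, honoring DST."""
    return dt_utc.astimezone(MY_TZ)

summer = utc_to_local(datetime(2012, 7, 4, 18, 17, tzinfo=timezone.utc))
winter = utc_to_local(datetime(2012, 2, 10, 15, 35, tzinfo=timezone.utc))

print(summer.strftime("%H:%M %Z"))  # 13:17 CDT
print(winter.strftime("%H:%M %Z"))  # 09:35 CST
```

The same zone object yields a five-hour offset in July and a six-hour offset in February, which is the DST-aware behavior the embedded timestamps rely on.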

The easy_to_read_timestamp_string function makes use of the “hyphened” forms of two of the strftime codes. That removes any leading zeros from the day and the hour, another little time trick I learned recently.
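One caveat: the hyphenated codes are an extension of the C library on Linux and macOS rather than part of standard strftime, and they fail on Windows. A portable sketch, assuming Python 3.6 or later for f-strings, builds the day and hour from the datetime's own attributes instead. The function name is hypothetical, and the fixed-offset zone labeled "CDT" is only there to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone

def portable_timestamp_string(dt):
    """Format like 'Wed Jul 4 2012 1:17 PM CDT' without %-d or %-I."""
    hour12 = dt.hour % 12 or 12  # maps 0 to 12 and 13 to 1
    return f"{dt:%a %b} {dt.day} {dt:%Y} {hour12}:{dt:%M %p %Z}"

# A fixed-offset zone with an explicit name, so %Z prints 'CDT'.
cdt = timezone(timedelta(hours=-5), "CDT")
print(portable_timestamp_string(datetime(2012, 7, 4, 13, 17, tzinfo=cdt)))
# Wed Jul 4 2012 1:17 PM CDT
```

The day and month names still come from strftime, so they follow the current locale just as the original function does.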

The results look like this:

According to my corner of Twitter, the big takeaway from CERN’s presentation is that the slides were ugly, and that they used Comic Sans.
  — potatowire (@potatowire) Wed Jul 4 2012 1:17 PM CDT

In many cases, of course, the extra precision this timestamp gives is unnecessary, but I have written posts in which I embedded a few tweets from the same day, and including the times adds context to the tweets.

By the way, the time zone given at the end of the timestamp is DST-aware and uses the time in effect when the tweet was written. If I embed a tweet that was written during the winter, the timestamp is given in standard time:

Ordering cat food via Amazon Prime seems like the kind of thing I should tell you about on Twitter.
  — mikemorrow (@mikemorrow) Fri Feb 10 2012 9:35 AM CST

Why should tweets written by people in (possibly) other time zones be presented in Central Time? Because it’s my blog, that’s why.