Blogging from stdin

Lately I’ve been figuring out ways to wean myself from TextMate. While it’s possible that the open-sourcing of TM2 will turn out to be a great success, I wouldn’t lay money on it. And even if it does, it’s unlikely to happen quickly (remember when Netscape open-sourced Navigator?) and there’s no guarantee I’ll like the result.

So I need some tools to help me work in other editors. One of my favorite TextMate tools is Brad Choate’s Blogging Bundle, and I wanted something similar that I could call from any editor or from the command line. There’s probably a way to dig into the bundle and extract a few standalone scripts, but that seemed as time-consuming as just writing the few commands I need from scratch—especially since the Blogging Bundle is written in Ruby, a language I’m not comfortable with.

The two essential parts of a blogging system are:

I decided to write a Python script to handle the publishing. There are a few Python WordPress libraries floating around, but none of them seems to be the definitive library, so I decided to write my script without one, using xmlrpclib instead. Apart from some idiosyncrasies in both xmlrpclib and the WordPress MetaWeblog implementation, it went pretty smoothly.

My goal was to start with text that looks like this:

Title: Blogging from stdin
Keywords: programming, python, blogging, wordpress, xmlrpc

Lately I've been figuring out ways to wean myself from
TextMate. While it's possible that the open-sourcing of
TM2 will turn out to be a great success, I wouldn't lay
money on it. And even if it does, it's unlikely to
happen quickly (remember when Netscape open-sourced
Navigator?) and there's no guarantee…

I’d then run it through a script that

  1. Parses the header lines (which could also include a line for the date and time at which the post is to be published).
  2. Publishes the post.
  3. Returns text that looks like the above, but with more header lines. This text could, if necessary, be edited and the post republished.

The return value should look like this:

Title: Blogging from stdin
Keywords: blogging, programming, python, wordpress, xmlrpc
Date: 2012-08-15 23:15:27
Post: 1913
Slug: blogging-from-stdin
Status: publish
Comments: 1

Lately I've been figuring out ways to wean myself from
TextMate. While it's possible that the open-sourcing of
TM2 will turn out to be a great success, I wouldn't lay
money on it. And even if it does, it's unlikely to
happen quickly (remember when Netscape open-sourced
Navigator?) and there's no guarantee…

This is basically the way the Blogging Bundle’s Post to Blog command works, but with fewer header lines. Because I’m not writing a general-purpose tool, I don’t need a line for the blog’s name; and because I don’t use categories here on ANIAT, I don’t need a line for that, either.
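
To make the header/body format concrete, here is a minimal Python 3 sketch of splitting such text apart and putting it back together. It is an illustration only, not code from publish-post; the function names parse_post and format_post are invented for this example.

```python
def parse_post(source):
    """Split post text into a header dict and a body string."""
    # The header ends at the first blank line; everything after is the body.
    header_text, body = source.split('\n\n', 1)
    header = {}
    for line in header_text.split('\n'):
        # Split on the first colon-space only, so a title may contain ': '.
        key, value = line.split(': ', 1)
        header[key] = value
    return header, body


def format_post(header, body, order):
    """Rebuild post text, emitting header fields in the given order."""
    lines = ['%s: %s' % (k, header[k]) for k in order if k in header]
    return '\n'.join(lines) + '\n\n' + body


text = ('Title: Blogging from stdin\n'
        'Keywords: programming, python\n'
        '\n'
        'Lately I have been figuring out ways...\n')
header, body = parse_post(text)
rebuilt = format_post(header, body, ['Title', 'Keywords', 'Date', 'Post'])
```

Fields named in the order list but absent from the header (Date and Post here) are simply skipped, so the same formatter would work before and after publishing.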

Here’s the script, named publish-post:

 1:  #!/usr/bin/python
 2:
 3:  '''
 4:  Take text from standard input in the format
 5:
 6:    Title: Blog post title
 7:    Keywords: key1, key2, etc
 8:
 9:    Body of post after the first blank line.
10:
11:  and publish it to my WordPress blog. Return in standard output
12:  the same post after publishing. It will then have more header
13:  fields (see hFields for the list) and can be edited and re-
14:  published again and again.
15:
16:  The goal is to work the same way TextMate's Blogging Bundle does
17:  but with fewer headers.
18:  '''
19:
20:  import xmlrpclib
21:  import sys
22:  import os
23:  from datetime import datetime, timedelta
24:  import pytz
25:
26:  # Blog parameters (url, user, pw) are stored in ~/.blogrc.
27:  # One parameter per line, with name and value separated by colon-space.
28:  p = {}
29:  with open(os.environ['HOME'] + '/.blogrc') as bloginfo:
30:    for line in bloginfo:
31:      k, v = line.split(': ')
32:      p[k] = v.strip()
33:
34:  # The header fields and their metaWeblog synonyms.
35:  hFields = [ 'Title', 'Keywords', 'Date', 'Post',
36:              'Slug', 'Link', 'Status', 'Comments' ]
37:  wpFields = [ 'title', 'mt_keywords', 'date_created_gmt',  'postid',
38:               'wp_slug', 'link', 'post_status', 'mt_allow_comments' ]
39:  h2wp = dict(zip(hFields, wpFields))
40:
41:  def makeContent(header):
42:    "Make the content dict from the header dict."
43:    content = {}
44:    for k, v in header.items():
45:      content.update({h2wp[k]: v})
46:    content.update(description=body)
47:    return content
48:
49:  # Read and parse the source.
50:  source = sys.stdin.read()
51:  header, body = source.split('\n\n', 1)
52:  header = dict( [ x.split(': ', 1) for x in header.split('\n') ])
53:
54:  # For uploading, the date must be in UTC and a DateTime instance.
55:  utc = pytz.utc
56:  myTZ = pytz.timezone('US/Central')
57:  if 'Date' in header:
58:    # Get the date from the string in the header.
59:    dt = datetime.strptime(header['Date'], "%Y-%m-%d %H:%M:%S")
60:    dt = myTZ.localize(dt)
61:    header['Date'] = xmlrpclib.DateTime(dt.astimezone(utc))
62:  else:
63:    # Use the current date and time.
64:    dt = myTZ.localize(datetime.now())
65:    header.update({'Date': xmlrpclib.DateTime(dt.astimezone(utc))})
66:
67:  # Connect and upload the post.
68:  blog = xmlrpclib.Server(p['url'])
69:
70:  if 'Post' in header:
71:    # Editing an old post.
72:    postID = int(header['Post'])
73:    del header['Post']
74:    content = makeContent(header)
75:    blog.metaWeblog.editPost(postID, p['user'], p['pw'], content, True)
76:  else:
77:    # Publishing a new post.
78:    content = makeContent(header)
79:    postID = blog.metaWeblog.newPost(0, p['user'], p['pw'], content, True)
80:
81:  # Return the post as text in header/body format for possible editing.
82:  post = blog.metaWeblog.getPost(postID, p['user'], p['pw'])
83:  header = ''
84:  for f in hFields:
85:    if f == 'Date':
86:      # Change the date from UTC to local and from DateTime to string.
87:      dt = datetime.strptime(post[h2wp[f]].value, "%Y%m%dT%H:%M:%S")
88:      dt = utc.localize(dt).astimezone(myTZ)
89:      header += "%s: %s\n" % (f, dt.strftime("%Y-%m-%d %H:%M:%S"))
90:    else:
91:      header += "%s: %s\n" % (f, post[h2wp[f]])
92:  print header.encode('utf8')
93:  print
94:  print post['description'].encode('utf8')

I think it’s commented well enough, but there are a few points worth expanding on:

  1. I keep my blog’s XMLRPC server URL and my username and password in a .blogrc file in my home directory. The file is formatted like this:

    url: http://blahblahblah
    user: myusername
    pw: mypassword
  2. The script gets its input from stdin and returns its output to stdout rather than using files. This seemed like the most flexible arrangement, as I can always use piping and redirection if I need to hook the script up to particular files. An unexpected advantage of doing it this way: as I was debugging, I ran the script directly from the Terminal, piping pbpaste into it and pbcopy out—no need for test files.

  3. The documentation for WordPress’s MetaWeblog API has some errors, which I learned by exploring the return values from the metaWeblog.getPost command. The documentation says that the value of the mt_allow_comments field will be either open or closed; the value is actually either 1 or 0. It says the value of the mt_keywords field will be an array; it’s actually a string with the keywords separated by commas.
  4. The date_created_gmt field has to be expressed as a DateTime object. Confusingly, this is not an instance of the standard Python datetime class. It’s a special class defined in xmlrpclib. Some of the messing around you see in the code consists of handling this distinction.
  5. When publishing, the date_created_gmt field has to be given in UTC. Because I prefer to work in US/Central, that field must be converted back to my local time zone when a post is retrieved. There’s more messing around in the code to convert back and forth between time zones. I use the nonstandard pytz library to do the conversions.
  6. Lines 35 through 39, where I define the header fields and relate them to the field names in the WordPress MetaWeblog API, may look weird to you. Why am I defining two lists and then zipping them into a dictionary? Why not just make the h2wp dictionary directly? It’s because you can’t define the order of a dictionary’s keys, and I want the header fields ordered in a particular way in the returned text. The hFields list seemed like the simplest way to do that.
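
The two-list trick in the last point can be shown in isolation. The following is a cut-down Python 3 sketch, not the script itself: it uses only four of the fields, renames the lists to h_fields and wp_fields, and substitutes a made-up dictionary for the struct that metaWeblog.getPost would return.

```python
# Display names in the order they should appear in the returned text.
h_fields = ['Title', 'Keywords', 'Date', 'Post']
# The matching metaWeblog field names, in the same order.
wp_fields = ['title', 'mt_keywords', 'date_created_gmt', 'postid']
h2wp = dict(zip(h_fields, wp_fields))

# A made-up stand-in for the struct returned by metaWeblog.getPost.
post = {'postid': '1913', 'title': 'Blogging from stdin',
        'mt_keywords': 'blogging, python',
        'date_created_gmt': '2012-08-16 04:15:27'}

# Walking the list, not the dict, is what fixes the output order.
header = ''.join('%s: %s\n' % (f, post[h2wp[f]]) for f in h_fields)
```

Since Python 3.7, dictionaries keep their insertion order, so the separate list is no longer strictly required there. In Python 2, which the script targets, the list is what guarantees the order.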

So far, the script is working well, but I’m under no illusions—its error handling is practically nonexistent, and I’m sure I’ll run into problems eventually. I’ll solve them as they come along.
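
One place where a little error handling would be cheap is the reading of .blogrc: the split on line 31 fails on a blank line and on a password that contains a colon-space. Here is a hedged Python 3 sketch of a more forgiving reader. The function name read_blogrc and the error messages are inventions for this example, and it takes any iterable of lines so it can be tried without a real file.

```python
def read_blogrc(lines):
    """Parse 'name: value' lines into a dict, skipping blank lines."""
    params = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # partition never raises; a missing separator leaves sep empty.
        # It splits at the first ': ' only, so values may contain ': '.
        key, sep, value = line.partition(': ')
        if not sep:
            raise ValueError('bad .blogrc line: %r' % line)
        params[key] = value
    missing = [k for k in ('url', 'user', 'pw') if k not in params]
    if missing:
        raise ValueError('missing .blogrc entries: ' + ', '.join(missing))
    return params


sample = ['url: http://blahblahblah\n', '\n',
          'user: myusername\n', 'pw: my: password\n']
params = read_blogrc(sample)
```

With a real file, the call would be something like read_blogrc(open(path)), where path points at the .blogrc file.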

Update 8/16/12
As expected, there were bugs, and they didn’t take long to appear.

First, I had forgotten to encode the output to handle non-ASCII characters. That was a pretty easy fix.

More troublesome was my confusion over the dateCreated field. When I’d upload a post with dateCreated set to UTC, the publication time appeared correct (in US/Central) in the WordPress web interface, but the post wouldn’t get published at the indicated time. Very frustrating. After examining the metaWeblog.getPost output, I saw that the date_created_gmt field was 10 hours ahead of dateCreated, not 5 hours as it should be. Somehow, the time zone correction was being doubled.

I decided to dispense with dateCreated entirely and just use date_created_gmt.[1] I convert from local time to UTC before publishing and convert the other way after retrieving. I’m sure there’s a way to use dateCreated correctly, but I don’t have the patience to look into it.
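
The round trip between a local Date header and a UTC DateTime can be sketched on its own. This is a Python 3 illustration under stated assumptions, not the script's code: xmlrpclib is called xmlrpc.client in Python 3, the helper names to_wire and from_wire are made up, and a fixed UTC-5 offset stands in for US/Central so the example needs nothing outside the standard library.

```python
from datetime import datetime, timedelta, timezone
from xmlrpc.client import DateTime  # xmlrpclib.DateTime in Python 2

# Assumption: a fixed UTC-5 offset stands in for US/Central in summer.
# A real script needs a DST-aware zone, such as the pytz one in publish-post.
LOCAL = timezone(timedelta(hours=-5))
HEADER_FMT = '%Y-%m-%d %H:%M:%S'
WIRE_FMT = '%Y%m%dT%H:%M:%S'


def to_wire(date_string, local=LOCAL):
    """Turn a local 'Date:' header string into an XML-RPC DateTime in UTC."""
    dt = datetime.strptime(date_string, HEADER_FMT).replace(tzinfo=local)
    return DateTime(dt.astimezone(timezone.utc))


def from_wire(wire_date, local=LOCAL):
    """Turn an XML-RPC DateTime holding UTC back into a local header string."""
    dt = datetime.strptime(wire_date.value, WIRE_FMT)
    dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(local).strftime(HEADER_FMT)


wire = to_wire('2012-08-15 23:15:27')
local_again = from_wire(wire)
```

Attaching the zone with replace(tzinfo=...) is fine for a fixed offset like this one. With pytz zones it is not, which is why the script calls localize instead.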

Thanks to reader Adam Tinsley for pointing out the publication time bug.

One last tool: I made this TextExpander snippet for inserting the header:

[Image: TextExpander blog header snippet]

(Yes, I see the typo in the keywords. I fixed it before publishing.)

The snippet uses the new optional fill-in feature for the date. If I include the Date line, the post gets the date/time in it. If I don’t, the post gets the date/time when the command is run.

  1. Do you get the sense that the WordPress MetaWeblog API was written by different people at different times with very different ideas about naming standards? CamelCase for one date field, underscores for another. Every time I write them out, I have to check which is which. 

3 Responses to “Blogging from stdin”

  1. Luis Rivera says:

    This is probably me overthinking things, but this solution has you store your password in plain text on your computer. Would there be any way to integrate with 1Password or something similar for the sake of security and even ease of use (not having to update the file)?

    The next part is going to be me not knowing what I’m talking about (even more so) and it will probably show I don’t know that 1Password integration would work from the command line but in OS X maybe keychain access would? At least it would hide the password from plain text, I think.

  2. Daniel Jalkut says:

    Following up on Luis’s comment, a reasonable way of going about securely storing the password would be to put it in your Keychain but use the “security” tool from the command line to access it. You will get a one-time permission prompt like you do from an app, allowing security to access the password.

    The problem, then, is anybody who has the ability to run security will be able to access the password. Ideally the keychain connection could be made directly from the python script, but I don’t know off the top of my head if there are any built-in keychain bindings for Python.


  3. Dr. Drang says:

    Maybe I’m just too sanguine about this, but I don’t see the plaintext password as a big problem. My feeling is that if someone gets into my home directory, I’ve got bigger problems than him messing with my blog. I’d probably look at this differently if my computers were less secure physically.

    Still, there’s no sense in inviting trouble. I’ll look into both security and direct Python methods for encrypting the password. I can also dig around in the Blogging Bundle to see how it treats the password.

    That didn’t take too long. TextMate’s Blogging Bundle gets the password from the Keychain by calling security from within a Ruby script. It should be easy to do the same in Python.

    It does seem a little weird that security will cough up the password so readily. I realize that it does so because I gave security that privilege when I initialized the bundle. What that means, though, is that anyone who sits down at my computer (assuming it’s logged in to my account) can issue the security command with a few easily determined parameters and get the password. While this is certainly more work than looking for the password in a file, it doesn’t strike me as that much more secure. Maybe I don’t understand the security command as well as I think I do.
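
For reference, calling security from Python might look roughly like the Python 3 sketch below. It assumes the password was saved as a generic password item in the Keychain; the item name 'blog-xmlrpc' and both function names are hypothetical, and the lookup itself only works on macOS.

```python
import subprocess


def keychain_command(service, account):
    """Build the argv for macOS's security tool; -w prints only the password."""
    return ['security', 'find-generic-password',
            '-s', service, '-a', account, '-w']


def keychain_password(service, account):
    """Fetch a password from the Keychain (macOS only)."""
    out = subprocess.check_output(keychain_command(service, account))
    return out.decode('utf8').rstrip('\n')


# 'blog-xmlrpc' is a hypothetical Keychain item name, not one from the post.
cmd = keychain_command('blog-xmlrpc', 'myusername')
```

The script would then call keychain_password in place of reading the pw line from .blogrc, which would still hold the url and user entries.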