Archive for the ‘blogging’ Category

What’s wrong with Roger Ebert’s blog

It’s not the content, of course, it’s the formatting. Have you tried to read his RSS feed on an iPhone?

I started reading Roger’s latest blog entry this morning and ran into the same problem I always do when reading him on my iPhone.

The screenshot above is from the MobileRSS feed reader, but I get basically the same horrible formatting when reading it through NetNewsWire

and through the Google Reader mobile page.

Everything past the first couple of paragraphs gets squeezed down to pass through a narrow chute.

Grabbing the feed via

curl http://blogs.suntimes.com/ebert/atom.xml > ebert.rss

and isolating the area where the formatting goes crazy, we get

<p> Todd McCarthy reviewed films for Variety for 31 years.
He was the ideal critic for the paper -- better, we now
realize, than it deserved. His reviews and the reviews of
Kirk Honeycutt at the Hollywood Reporter were frequently the
first reviews of a new film to see print. Honeycutt
fortunately continues. <br /> </p>]]>

<![CDATA[<blockquote><blockquote><blockquote>Films are
traditionally screened for the "trades" before anyone else.
Historically, when independent theater owners around the
world booked their own theaters, they depended on Variety's
advance reviews to plan their bookings. These days theaters
are booked by accountants in Hollywood, often before a film
has been completed. Now that it's "product," it doesn't
matter so much if it's any good or not.

and the problem is clear: it’s the three consecutive <blockquote> tags that lead off the skinny part. Why are they there? The answer comes from looking not at the feed, but at the site itself.

Here’s the main page of Roger’s[1] blog,

and here’s the page for the latest entry

Each entry on the main page has an image and a paragraph or two of text. You’ll note that the length of the text matches the height of the image almost exactly, a feat that probably comes easily to a guy who’s been writing newspaper copy for five decades, but which seems amazing to me.

The triple <blockquote> comes right after those lead paragraphs and is what gives the subsequent text its left indentation. And turns that text into a thin trickle running down the center of my iPhone.

(The indentation also appears when I read his feed on my computer, of course, but it’s not as annoying on a full-sized screen.)
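The run of opening tags in the feed excerpt can also be spotted mechanically. Here is a minimal Python sketch, assuming the tags appear as plain `<blockquote>` with no attributes; the function name `blockquote_runs` is made up for illustration and is not part of any feed tool.

```python
import re

# Two or more opening <blockquote> tags in a row, allowing whitespace between them.
RUN = re.compile(r'(?:<blockquote>\s*){2,}', re.I)
OPEN_TAG = re.compile(r'<blockquote>', re.I)

def blockquote_runs(html):
    """Return the depth of each run of consecutive opening <blockquote> tags."""
    return [len(OPEN_TAG.findall(match.group(0))) for match in RUN.finditer(html)]
```

Fed the second CDATA section from the excerpt, it should report a single run three tags deep, while an ordinary, singly quoted passage produces no runs at all.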

I don’t know whether it’s Roger himself that’s putting in those <blockquote>s or whether it’s some blogging program he’s using, but whatever the source, it’s the old problem of using HTML for formatting instead of semantics. I wish one of Roger’s web-savvy friends—Andy Ihnatko, say—would step in and give him a little CSS assistance.


  1. I call him Roger instead of Mr. Ebert not just because his persona in print and on TV makes everyone feel like his friend. And not just because we share an alma mater and happy memories of Champaign-Urbana. No, there’s a deeper bond.

    Back in the late 80s, my wife and I were driving down through the center of Illinois on I-57. We pulled into a rest stop and noticed a BMW with the license plate ROSEBUD (or maybe ROSEBD) in the parking lot. As I went into the men’s room, I passed a familiar-looking portly guy with big glasses coming out the door. It wasn’t until I got back to my car that I realized I’d just had a brush with greatness.

    It’s the intimate relationship that comes from nearly sharing a rest room that puts Roger and me on a first name basis. 


Headline Fallows

The Atlantic rolled out some big changes to its website a couple of days ago. Normally, I wouldn’t notice something like this, because even though I’m a regular reader of James Fallows’ blog, I almost never visit the site itself. As I do with all my favorite blogs, I subscribe to his RSS feed and read his posts in Google Reader. But since the redesign, I can’t do that anymore.

Oh, there’s still a feed, but it provides only the headlines of Fallows’ posts, nothing more. Not even the first paragraph or two to give you a decent sense of the post’s topic. I assume the idea behind this change is to force us to go to the main site, pumping up the pageviews for The Atlantic’s advertisers. It won’t work; you can’t force someone to follow a link, and readers who’ve jumped on the RSS train will not be jumping off.

Merlin Mann has written a couple of tart posts today about the stupidity of this change. I’m more disappointed than angry, but the source of our displeasure is the same: we like reading Fallows, and we will read much less of him because of the anemic new feed. I sent this email to Fallows:

Is there some way you can prevail upon the Atlantic’s webmasters (and their masters) to return the RSS feed to providing the full text of your posts? I understand the need to make money and would not complain if the feed included ads. Many of the feeds I subscribe to have ads (Talking Points Memo, for example), and I stay subscribed to them. But I won’t continue to subscribe to a feed that provides only headlines.

I’m sure I’m not alone in this. People who do a lot of reading online have gotten used to using RSS and will not go back to the old way of clicking back and forth between dozens of sites. Especially since much of our blog reading is now done on our smartphones.

I mentioned Talking Points Memo because I know it’s a site Fallows is familiar with. I could just as easily have mentioned Daring Fireball or TidBITS. They’ve all figured out ways to get ads in their feeds, meeting the advertising requirements necessary to keep their businesses going while still providing articles in a form their readers want.

Update 2/28/10
Fallows sent me (and, apparently, about 800 other people) a polite response, agreeing with our complaints. Later came this post, acknowledging the RSS feed problem, and this one, telling us that it’s been fixed. I’m not sure that it has been fixed just yet—the “fixed” post still hasn’t appeared in my RSS reader—but it’s clear that a fix is at least on the way.

Interestingly, the headline-only feeds were not a bad commercial decision; they were just bad programming. Never attribute to malice that which can be explained by incompetence.


CWOB without tweets

Looking through my RSS reader this morning, I noticed that Andy Ihnatko has installed a WordPress plugin that collects and summarizes his Twitter stream from the previous day and publishes it as a blog post on the Celestial Waste of Bandwidth.[1] Since I

  1. subscribe to his RSS feed
  2. follow him on Twitter, and
  3. don’t feel the need to read everything twice,

I’ve used Yahoo! Pipes to create a customized feed that filters out the Twitter summary posts.

The URL for the feed is at this link.
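For readers without Yahoo! Pipes, the same kind of filter can be approximated in a few lines of Python. This is only a sketch, assuming an RSS 2.0 feed and assuming the summary posts can be recognized by their titles; the function name `drop_items` and whatever title pattern you pass in are hypothetical, not taken from the actual Pipes setup.

```python
import re
import xml.etree.ElementTree as ET

def drop_items(rss_text, title_pattern):
    """Return the RSS 2.0 text with every <item> whose title matches the pattern removed."""
    root = ET.fromstring(rss_text)
    pattern = re.compile(title_pattern, re.I)
    for channel in root.iter('channel'):
        # Copy the list first, because items are removed while looping.
        for item in list(channel.findall('item')):
            title = item.findtext('title', default='')
            if pattern.search(title):
                channel.remove(item)
    return ET.tostring(root, encoding='unicode')
```

Filtering on the title keeps the sketch short; a real filter might also need to look at categories or the post body, depending on how the summary posts are labeled.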

Regular readers of this blog may be arching their eyebrows. Didn’t Dr. Drang inflict exactly the same sort of redundancies on us last year? Yes, I did, and I’m sorry for it. In my defense, I’ll point out that I

I wouldn’t be surprised to see Andy go through the same steps.

Update 2/17/10
I think Andy has turned off the daily Twitter update, but a few days ago a weekly Twitter summary showed up in my RSS reader. I’ve reworked the above-linked Yahoo! Pipes filter to get rid of those, too. The screen shot shows the current filter.


  1. Ah, I remember when it was just a Colossal Waste of Bandwidth. 


An odd coincidence

Less than 24 hours after my last post, in which I said I wanted to explore build systems other than make to generate my serverless wiki project notes files, Allan Odgaard writes a post on make that leads me to think I shouldn’t be too hasty in abandoning a software tool that’s stood the test of time. Allan is a smart guy who’s written some great stuff and sounds very authoritative when he says computer sciencey things like “directed acyclic graph.” Also, the last example in his post is a frighteningly close match to what I’m doing, so I think I’ll listen to him.

He says this post is the first of two on the topic of make. I look forward to Post 2 almost as much as I do to TextMate 2. Hope I don’t have to wait as long.


PHP Markdown Extra Math

Tuesday’s MathJax post reminded me that I’ve never made public my modifications to PHP Markdown Extra that allow me to easily write equations here on the blog. So I took a virgin copy of Michel Fortin’s latest version of PHP Markdown Extra, applied my modifications, and put the result in a GitHub repository. From the repository’s README:


PHP Markdown Extra Math is an extension of Michel Fortin’s PHP Markdown Extra, a PHP script for converting text written in Markdown to HTML. The extension consists of adding support for mathematical equations written in LaTeX to be processed by Davide Cervone’s jsMath system.

Here’s how it works. The author, writing in Markdown, inserts inline equations like this

where \(\alpha = (t_1 - t_0)/L\) is the rate at which the thickness increases

enclosing the math in a \( … \) pair, just as if writing in LaTeX. PHP Markdown Extra Math converts that to

where <span class="math"> \alpha = (t_1 - t_0)/L </span> is the rate at which the thickness increases

which is then converted by jsMath into

where \alpha = (t_1 - t_0)/L is the rate at which the thickness increases

Similarly, display math is written like this:

Putting this into Castigliano's equation, we get

\[\Delta = \frac{\partial U^*}{\partial F} = \frac{12F}{Eb} \int_0^L \frac{x^2}{(t_0 + \alpha x)^3} dx\]

which PHP Markdown Extra Math will turn into this HTML

<p>Putting this into Castigliano's equation, we get</p>

<div class="math">\Delta = \frac{\partial U^*}{\partial F} = \frac{12F}{Eb} \int_0^L \frac{x^2}{(t_0 + \alpha x)^3} dx</div>  

which, in turn, will be rendered by jsMath like this:

Putting this into Castigliano’s equation, we get

\Delta = \frac{\partial U^*}{\partial F} = \frac{12F}{Eb} \int_0^L \frac{x^2}{(t_0 + \alpha x)^3} dx

The examples were taken directly from my post on Castigliano’s Second Theorem.

For posts that have just one or two equations, there’s not much to be gained by using the \( … \) and \[ … \] notation in place of <span class="math"> … </span> and <div class="math"> … </div>, but it’s a real timesaver when the equations start to pile up.
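The substitution itself is simple enough to sketch outside of PHP. The Python below is an illustration of the idea only, not the code in the repository: it assumes the delimiters never appear inside code spans and ignores Markdown's own escaping rules, and the function name `convert_math` is invented for the example.

```python
import re

# \( ... \) for inline math and \[ ... \] for display math; re.S lets math span lines.
INLINE = re.compile(r'\\\((.+?)\\\)', re.S)
DISPLAY = re.compile(r'\\\[(.+?)\\\]', re.S)

def convert_math(text):
    """Wrap LaTeX-delimited math in the span/div markup that jsMath looks for."""
    # Function replacements are used so backslashes in the math are copied literally.
    text = DISPLAY.sub(lambda m: '<div class="math">%s</div>' % m.group(1), text)
    text = INLINE.sub(lambda m: '<span class="math">%s</span>' % m.group(1), text)
    return text
```

A real implementation has to run at the right point in the Markdown pipeline, so that the math is protected before backslash escapes and emphasis are processed.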

If you’re a jsMath aficionado, you may be wondering why I don’t take advantage of its tex2math extension, which will interpret the \( … \) and \[ … \] notation directly. It’s because Markdown already has a backslash escape syntax that works with parentheses and brackets, and there’s no way for me to keep the notation straight if Markdown and jsMath are fighting with each other. So I have the tex2math extensions turned off.


Another blog update

I’ve made two changes to the blog in the last few days:

  1. Traditional comments (not Twitter comments) have been turned on.
  2. My Twitter entries are no longer being posted here.

As I’ve said before, I didn’t implement commenting when I changed the blog from Movable Type to WordPress because I was tired of comment spam. As time went on, the thought of digging into my WordPress template and making a comment section consistent with the overall style of the blog gave me the willies, and I just kept putting it off. It turned out, though, that editing an existing template to make a reasonably decent looking comment section took just an hour or so.

The commenting rules are:

The experiment in commenting via Twitter is over. It worked fairly well, but had two big drawbacks:

  1. It forced you to the Twitter web page, even if you normally tweet via a standalone client. Often this forced a login step, taking away the spontaneity of Twitter.
  2. Sometimes 140 characters just isn’t enough.

The other Twitter experiment, gathering my tweets from the previous day and publishing them as a single post here, is also over. Although everything I write is a gem of transcendent brilliance, I’m tired of seeing Twitter posts that link to a regular post immediately below. There will be no more crossing of the streams.

As an adjunct to stopping the Twitter posts, I’ve also removed the links to the Twitterless RSS feeds. The feeds themselves, implemented through Yahoo! Pipes, still exist, so if you’ve subscribed that way your subscription will still work. But they’ll no longer give results different from the standard feeds.



Revision control in WordPress

I’m sure this is old hat to many if not all of you, but I’m posting it anyway, so I have a reference to it. Somewhere around version 2.6, WordPress added a Revisions feature. Each time you revise a post, a new database entry is made and all the older versions are kept. The idea was to be able to see the revisions as you would in a wiki. Not a bad idea for some blogs, I’m sure, but not anything I care about.

In fact, I actively dislike the feature, because it means that sequential posts will no longer have sequential post numbers. Although I don’t use post numbers in my normal permalinks, I do use them with the Twitter comment system and I like them to be sequential.

When I updated the blog from 2.3 to 2.8.4 a couple of weeks ago, I didn’t know anything about the revision system, so when I saw the post numbers incrementing by two or three or four instead of one, I worried that the blog database had been hacked in some way. A bit of Googling led me to both the explanation and the solution. I added the line

define('WP_POST_REVISIONS', false);

to my wp-config.php file just before writing this post. Presumably, the post numbering will be sequential from now on.



Updated Twitter posting and an apology

Let’s start with the apology. Earlier this week, RSS subscribers got repeated notices of the daily roundup of my Twitter updates. These posts are generated automatically overnight; because of a screwup on my part, two posts were being made each day and the feeds followed suit. I’m sorry about that. It took me a couple of days to figure out what was wrong, but it seems to be fixed now.

The screwup came as I was trying to make the twitterpost script more robust. When it’s working, twitterpost

I have a launchd process set up to run twitterpost in the middle of the night to automatically generate a blog post of my deathless thoughts. But it had stopped working well.

You may recall that at the beginning of the month Twitter was quite unreliable—more so than usual. For several days in a row, twitterpost had been unable to connect when it ran, so it would just fail, and there’d be no automatic posting for that day. I’d have to run the script by hand when Twitter was responsive. To fix this problem, I rewrote twitterpost to check whether the connection was successful and try again several minutes later if it wasn’t. Here’s the new source code:

  1:  #!/usr/bin/python
  2:  
  3:  import twitter
  4:  from datetime import datetime, timedelta
  5:  from time import sleep
  6:  import pytz
  7:  import wordpresslib
  8:  import sys
  9:  import re
 10:  
 11:  # Parameters.
 12:  tname = 'drdrang'                   # the Twitter username
 13:  chrono = True                       # should tweets be in chronological order?
 14:  replies = False                     # should Replies be included in the post?
 15:  burl = 'http://www.leancrew.com/all-this/xmlrpc.php'    # the blog xmlrpc URL
 16:  bid = 0                             # the blog id
 17:  bname = 'drdrang'                   # the blog username
 18:  bpword = 'seekret'                  # not the real blog password
 19:  bcat = 'personal'                   # the blog category
 20:  tz = pytz.timezone('US/Central')    # the blog timezone
 21:  utc = pytz.timezone('UTC')
 22:  
 23:  # Get the starting and ending times for the range of tweets we want to collect.
 24:  # Since this is supposed to collect "yesterday's" tweets, we need to go back
 25:  # to 12:00 am of the previous day. For example, if today is Thursday, we want
 26:  # to start at the midnight that divides Tuesday and Wednesday. All of this is
 27:  # in local time.
 28:  yesterday = datetime.now(tz) - timedelta(days=1)
 29:  starttime = yesterday.replace(hour=0, minute=0, second=0, microsecond=0)
 30:  endtime = starttime + timedelta(days=1)
 31:  
 32:  # Create a regular expression object for detecting URLs in the body of a tweet.
 33:  # Adapted from
 34:  # http://immike.net/blog/2007/04/06/5-regular-expressions-every-web-programmer-should-know/
 35:  url = re.compile(r'''(https?://[-\w]+(\.\w[-\w]*)+(/[^.!,?;"'<>()\[\]\{\}\s\x7F-\xFF]*([.!,?]+[^.!,?;"'<>()\[\]\{\}\s\x7F-\xFF]+)*)?)''', re.I)
 36:  
 37:  # A regular expression object for initial hash marks (#). These must
 38:  # be escaped so Markdown doesn't interpret them as a heading.
 39:  hashmark = re.compile(r'^#', re.M)
 40:  
 41:  ##### Twitter interaction #####
 42:  
 43:  # Get all the available tweets from the given user. If Twitter is unresponsive,
 44:  # wait a while and try again. If it's still unresponsive after several tries,
 45:  # just give up.
 46:  api = twitter.Api()
 47:  tries = 9
 48:  trial = 0
 49:  while trial < tries:
 50:      trial += 1
 51:      try:
 52:          statuses = api.GetUserTimeline(user=tname)
 53:      except Exception:
 54:          print "Can't connect to Twitter on try %d..." % trial
 55:          sleep(15*60)
 56:          continue
 57:      break
 58:  else:                               # no break, so every try failed
 59:      print 0
 60:      sys.exit()
 61:  
 62:  # Tweets are in reverse chronological order by default.
 63:  if chrono:
 64:      statuses.reverse()
 65:  
 66:  # Collect every tweet and its timestamp in the desired time range into a list.
 67:  # The Twitter API returns a tweet's posting time as a string like 
 68:  # "Sun Oct 19 20:14:40 +0000 2008." Convert that string into a timezone-aware
 69:  # datetime object, then convert it to local time. Filter according to the
 70:  # start and end times.
 71:  tweets = []
 72:  for s in statuses:
 73:      posted_text = s.GetCreatedAt()
 74:      posted_utc = datetime.strptime(posted_text, '%a %b %d %H:%M:%S +0000 %Y').replace(tzinfo=utc)
 75:      posted_local = posted_utc.astimezone(tz)
 76:      if (posted_local >= starttime) and (posted_local < endtime):
 77:          timestamp = posted_local.strftime('%I:%M %p').lower()
 78:          # Add or escape Markdown syntax.
 79:          body = url.sub(r'<\1>', s.GetText())      # URLs
 80:          body = hashmark.sub(r'\#', body)          # initial hashmarks
 81:          body = body.replace('\n', '  \n')         # embedded newlines
 82:          if replies or body[0] != '@':
 83:              if timestamp[0] == '0':
 84:                  timestamp = timestamp[1:]
 85:              tweet = '[**%s**](http://twitter.com/%s/statuses/%s)  \n%s\n' % (timestamp, tname, s.GetId(), body)
 86:              tweets.append(tweet)
 87:  
 88:  # Obviously, we can quit if there were no tweets.
 89:  if len(tweets) == 0:
 90:      print 0
 91:      sys.exit()
 92:  
 93:  # A line for the end directing readers to the post with this program.
 94:  lastline = """
 95:  
 96:  *This post was generated automatically using the script described [here](http://www.leancrew.com/all-this/2009/07/updated-twitter-to-blog-script/).*
 97:  """
 98:  
 99:  # Uncomment the following 2 lines to see the output w/o making a blog post.
100:  # print '\n'.join(tweets) + lastline
101:  # sys.exit()
102:  
103:  ##### Blog interaction #####
104:  
105:  # Connect to the blog.
106:  blog = wordpresslib.WordPressClient(burl, bname, bpword)
107:  blog.selectBlog(bid)
108:  
109:  # Create the info we're going to post.
110:  post = wordpresslib.WordPressPost()
111:  post.title = 'Tweets for %s' % yesterday.strftime('%B %e, %Y')
112:  post.description = '\n'.join(tweets) + lastline
113:  post.categories = (blog.getCategoryIdFromName(bcat),)
114:  
115:  # And post it.
116:  newpost = blog.newPost(post, True)
117:  print newpost

The new stuff is in Lines 43-60. The connection to Twitter is now in the body of a loop. If the connection fails, the connection attempt is repeated 15 minutes later, for a maximum of 9 attempts over a two-hour period. If the connection succeeds, the process breaks out of the loop and continues on to create the post.
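The same retry pattern can be pulled out into a small reusable function. What follows is a sketch in Python 3 (the script above is Python 2), with a hypothetical name, `retry`, and an injectable `sleep` argument added purely so the loop can be exercised without waiting 15 minutes per failure.

```python
import time

def retry(action, tries=9, wait=15 * 60, sleep=time.sleep):
    """Call action() until it succeeds, pausing between failures.

    Returns the result of action(), or None if every try raised an exception.
    """
    for attempt in range(1, tries + 1):
        try:
            return action()
        except Exception:
            print("Can't connect on try %d..." % attempt)
            # No point in waiting after the final failure.
            if attempt < tries:
                sleep(wait)
    return None
```

Returning None on total failure is a simplification; if the wrapped call can legitimately return None, re-raising the last exception would be the safer design.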

I tested the new twitterpost by running it with my network cable initially disconnected (Twitter wouldn’t oblige me by being down during testing). After a few loops, I’d reconnect the cable and the rest of the script would execute. It seemed to be working perfectly, so I was surprised to find the repeated posts the next day.

My first thought was that some mistake in the loop was making the script stutter. But it wouldn’t stutter when I tested it, and the connection to the blog wasn’t even in the loop. Eventually I learned that I had somehow created a second launchd process; the posts were repeated because twitterpost was being run twice each night. Killing the second launchd process eliminated the repeated post.

Update 9/15/09
Hey, it works! Last night, the script couldn’t connect to Twitter for an hour. I have it set to start at 1:15 am, but last night’s post is timestamped 2:15 am. Launching Console to look at the system log, I found this:

Sep 15 02:15:05 drang com.leancrew.twitterpost[45665]: Can't connect to Twitter on try 1...
Sep 15 02:15:05 drang com.leancrew.twitterpost[45665]: Can't connect to Twitter on try 2...
Sep 15 02:15:05 drang com.leancrew.twitterpost[45665]: Can't connect to Twitter on try 3...
Sep 15 02:15:05 drang com.leancrew.twitterpost[45665]: Can't connect to Twitter on try 4...
Sep 15 02:15:05 drang com.leancrew.twitterpost[45665]: 998

Four failures and then a success. I know the timestamps make it look like all the attempts were made at 2:15, but that’s because the script doesn’t flush its output as it runs; all the output hits the system log when the script finishes. I saw the same thing when I was testing the script with the network cable disconnected.
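If accurate timestamps in the log matter, one option is to flush after every message so each line reaches the log when it is written rather than when the script exits. This is a small Python 3 sketch of that idea; the helper name `log` is invented here and is not part of twitterpost.

```python
import sys

def log(message, stream=None):
    """Write one line and flush immediately so it is not held in a buffer."""
    stream = sys.stdout if stream is None else stream
    stream.write(message + '\n')
    stream.flush()
```

In Python 3 the built-in `print(message, flush=True)` does the same job; the explicit helper just makes the flush visible.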

I did my best to simulate a failed connection in my testing, and I thought it would work, but it’s still nice to see it doing what it’s supposed to in a real world situation.

Because I felt bad about sending out repeated feeds, and because I began to question the value of these automatic tweet posts, I decided to create new feeds that excluded the Twitter posts. Patrick Mosby had already done this through Yahoo! Pipes, and my first thought was to just link to his feed. But I didn’t want to be dependent on his maintaining that feed, so I just copied his work on my own Yahoo! account.

So now there are four feeds: an RSS 2.0 feed with every post, an Atom feed with every post, an RSS 2.0 feed without the Twitter posts, and an Atom feed without the Twitter posts.

I’ve never understood the niceties of RSS 2.0 v. Atom, but since WordPress and Yahoo! Pipes make it easy to provide both, that’s what I’ll do.



An update on WordPress

Late last night I upgraded the WordPress engine for the blog to 2.8.4, the version that’s supposed to be resistant to this nasty worm that’s working its way across the internets. I mentioned in this post yesterday that although “And now it’s all this” hadn’t been hit, I’d be upgrading and things might be a bit weird here for a couple of days. But the upgrade went very smoothly and quickly. I added an addendum to that effect to yesterday’s post, but I thought a bit more detail was in order.

Let me first confess that I was one of those people the WordPress developers hate. I installed WP 2.3 back in January of 2008 and never upgraded it until yesterday. Why not? Well, laziness would be the first reason, but in my defense I must say the WordPress upgrade instructions for older versions were written in a way that emphasized the pitfalls of the process. In fact, they’re still written that way. It’s quite different from the installation instructions, which are all lollipops and fuzzy kittens. So I got scared and put it off. And off.

Also, the first upgrade from 2.3 came out right after I’d installed it. I’d just gone through a lot of work transferring the blog from Movable Type on a different host, and I wasn’t in the mood to go through that kind of hassle again. I wanted a chance to write the blog before I had to administer it again.

As it turned out, my worries were unfounded. Here’s what I did (after backing up all the files and the database):

  1. Deactivated all the plugins.
  2. Downloaded the latest WordPress tarball and put it on the server.
  3. Untarred it into a directory named “wordpress” (this is the default).
  4. Copied the unique things from my all-this/ directory (where the 2.3 blog lived) to the new wordpress/ directory. This included
    • folders of images and sounds,
    • my theme,
    • the plugins,
    • a JavaScript file that (note to self) really ought to be in with my theme,
    • my favicon,
    • an .htaccess file, and
    • a file used by Google Analytics.
  5. Opened wordpress/wp-config-sample.php, entered the administrative info from all-this/wp-config.php, and saved it as wordpress/wp-config.php.
  6. Renamed all-this/ to all-this-old/
  7. Renamed wordpress/ to all-this/.
  8. Opened all-this/wp-admin in my browser and clicked the button to update the database.
  9. Relogged in to all-this/wp-admin, updated the flickrRSS plugin, and activated all the plugins.
  10. Tried several pages and links to convince myself that the blog was really up and running.
  11. Deleted all-this-old/.
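Steps 4, 6, and 7 are the ones that lend themselves to scripting. The Python 3 sketch below (3.8 or later, for `dirs_exist_ok`) shows one way they might be automated; it is not what was actually run for this upgrade. The function name `swap_in_upgrade` and the `keep` list are hypothetical, and the database backup, the wp-config.php edit, and the final deletion of the old directory are deliberately left as manual steps.

```python
import shutil
from pathlib import Path

def swap_in_upgrade(root, keep, live="all-this", fresh="wordpress"):
    """Copy site-specific items into the fresh install, then swap the directories.

    root  -- directory containing both the live blog and the untarred release
    keep  -- names (files or folders) to carry over from the live blog
    Returns the path of the renamed old installation.
    """
    root = Path(root)
    old, new = root / live, root / fresh
    for name in keep:
        src, dst = old / name, new / name
        if src.is_dir():
            # Merge into the new tree without disturbing the release's own files.
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
    backup = root / (live + "-old")
    old.rename(backup)   # step 6: all-this/ becomes all-this-old/
    new.rename(old)      # step 7: wordpress/ becomes all-this/
    return backup
```

Keeping the old directory around until the new site has been checked mirrors the order of the steps above: the swap is cheap to undo as long as all-this-old/ still exists.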

This may seem like a lot of steps (and it is when compared to the automatic upgrade that I should be able to do from now on), but it was quite straightforward. The only reason it took as long as an hour was because I kept checking and rechecking my steps as I went along.

So I’ve learned my lesson and will upgrade promptly from now on.

As for the worm itself (is it really a worm? I thought worms were programs that run independently, not within the framework of another program), I’m not sure yet how bad it really is. Many people are reporting being hit; many, like me, are reporting no problems. But so far I’ve seen no statistics, just anecdotes. Maybe there’ll be better information after the holiday.



Blog upgrade time

Via Gruber, a Movable Type fan, I see that old WordPress installations are under attack. Whoop! Whoop! Run to the nearest Tube station! Release the barrage balloons!

Sorry, I’ve been listening to a lot of BBC recently.

Anyway, as far as I can tell, this little backwater of the internets has yet to be blitzed, but it’s probably as good a time as any to do an upgrade. We Yanks have a three-day weekend and I’m a little under the weather, so a bit of computer work seems in order. I’ve been backing things up to my local machine and will probably start dismantling and reassembling the blog in an hour or two.

So don’t be surprised if things are a bit wonky for a day or two. But eventually this blog will move forward into broad, sunlit uplands. And if it lasts for a thousand years, men will still say, “This was its finest hour.”

Update 9/6/09
Well, that was pretty painless. I didn’t get started until about an hour ago, and now it’s all done.[1] I just followed WordPress’s simple upgrade instructions (the Three Step Manual Upgrade; the version I was running before was too old to use the Automatic Upgrade), thought everything through before I did it, and used a checklist to make sure I didn’t forget anything.

I think it helped that my theme is very simple and that I use only a few plugins. The flickrRSS plugin was upgraded with a single click. I wouldn’t have upgraded the PHP Markdown Extra plugin even if a newer one were available, because I’m using a customized version that handles equations.

Overall, an almost pleasant experience. I can only hope the Snow Leopard upgrade I plan to do next week is as smooth.



  1. It really was my finest hour.