Twitter's shortened links

A few days ago Brett Terpstra tweeted a link to this post on the Twitter development blog. Changes to the way Twitter handles in-tweet links are coming, some in the indeterminate future of “eventually,” some in just a week. I needed to make some changes to Dr. Twoot to accommodate Twitter’s new ways.

The changes are summarized in this paragraph:

Beginning August 15th, when a user tweets or sends a direct message containing a URL 20 characters long or greater (the length of URLs wrapped with t.co), the URL will automatically be converted to a t.co-wrapped link. We will eventually wrap all links, regardless of length, but until then there’s nothing you need to do to support this change. When we’re ready to wrap all links, we’ll give you plenty of time and make another announcement.

I’ve been using the Metamark service for shortening URLs for quite some time. I built a set of TextExpander snippets to make the operation painless. But Metamark’s shortened URLs are 20 characters long—just long enough to trip the auto-shortening that Twitter’s going to start doing on the 15th. I could continue to use Metamark, of course, but all my links would be doubly indirect: The t.co link made by Twitter would point to the xrl.us link made by Metamark, which would then point to the page I’m tweeting about. That’s a waste.

In addition to saving one level of indirection, there are some real advantages to using Twitter’s shortening service. Tweets now (optionally) include something called Tweet Entities, metadata about images, URLs, mentions, and hashtags embedded in the tweet. The URL metadata includes the t.co URL as it appears in the tweet, the expanded URL it points to, and a display URL meant to be shown to readers.

The advantage of using t.co shortening in my tweets is that my followers will see the display URL on the Twitter web page, in the official Twitter clients, and in up-to-date third-party clients. This is better than seeing an anonymous http://xrl.us/xxxxxx link.

Furthermore, by querying the Tweet Entities, Dr. Twoot can become an up-to-date Twitter client and show URLs just like the big boys do.
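To make the metadata a bit more concrete, here is a rough sketch of what the URL portion of a tweet's entities might look like, and how a client could choose the text to show. The field names `url`, `display_url`, and `expanded_url` are the ones Dr. Twoot reads later in this post; the sample tweet, all of its values, and the helper name `linkText` are invented for illustration.

```javascript
// Hypothetical tweet with one t.co-wrapped link; every value here is made up.
var sampleTweet = {
  text: "A handy regex reference: http://t.co/abc1234",
  entities: {
    urls: [
      {
        url: "http://t.co/abc1234",                // what appears in the tweet text
        display_url: "example.com/regex-ref...",   // short, readable form
        expanded_url: "http://example.com/regex-reference.html" // real destination
      }
    ]
  }
};

// Hypothetical helper: prefer the display URL, fall back to the raw URL.
function linkText(u) {
  return (u.display_url != null) ? u.display_url : u.url;
}

console.log(linkText(sampleTweet.entities.urls[0]));
```

A client that ignores entities would show only the t.co string from `text`; one that reads them can show the friendlier form.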

Using t.co-shortened links in my tweets could be as simple as just clicking this option on the Twitter settings page for Dr. Twoot,

Dr. Twoot's link setting

and never using any of my URL-shortening TextExpander snippets. The problem with taking this easy approach is that Dr. Twoot’s character countdown display would be incorrect.

Character countdown

The countdown is fine when there’s no URL in the tweet, but without any changes it would count the full length of a URL instead of just the 20 characters it’s going to take up after being shortened. It would be nice if Twitter gave us access to the shortened URL before the tweet is posted, but they don’t. So I had to rewrite the countdown code to account for what the length of the tweet will be after shortening. Here it is:

javascript:
  // Update the countdown to show the characters left after t.co shortening.
  function charCountdown() {
    var body = $("#status").val();
    var charsLeft = 140 - body.length;
    var urlRE = new RegExp(URL_RE, 'g');
    var matches = body.match(urlRE);
    if (matches) {
      // Each URL will take up SURL characters, not its typed length.
      charsLeft -= matches.length*SURL;
      charsLeft += matches.join('').length;
    }
    // Switch to the warning style when 20 or fewer characters remain.
    if (charsLeft <= 20) {
      $("#count").removeClass("normal");
      $("#count").addClass("warning");
    }
    else {
      $("#count").removeClass("warning");
      $("#count").addClass("normal");
    }
    if (charsLeft == 0) {
      $("#count").html("Twoosh!");
    }
    else {
      $("#count").html(String(charsLeft));
    }
  }

The input field for the tweet has an ID of “status,” so that’s the string that’s being queried. URL_RE is a regular expression that does a pretty good job of detecting URLs. I use it a couple of times in Dr. Twoot, so it’s defined at the top of the file:

javascript:
var URL_RE = 'https?://[^ \\n]+[^ \\n.,;:?!&\'"’”)}\\]]';
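As a quick illustration of what that pattern does and doesn't pick up, the following sketch runs it against a sentence with links followed by punctuation. The helper name `findUrls` and the sample text are hypothetical; the pattern itself is copied from above so the snippet stands alone.

```javascript
// Same pattern as URL_RE above, repeated so this snippet is self-contained.
var URL_RE = 'https?://[^ \\n]+[^ \\n.,;:?!&\'"’”)}\\]]';

// Hypothetical helper: return every URL found in a string (empty array if none).
function findUrls(text) {
  return text.match(new RegExp(URL_RE, 'g')) || [];
}

var sample = "Read http://example.com/page. Or (https://example.org/a?b=1)!";
console.log(findUrls(sample));
// → [ 'http://example.com/page', 'https://example.org/a?b=1' ]
```

The last character class is what keeps sentence-ending periods, closing parentheses, and similar trailing punctuation out of the match.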

Basically, the function

  1. Performs a naive calculation of the characters left by subtracting the field length from 140.
  2. Finds all the URLs in the field.
  3. Adjusts the naive calculation by adding back in the total length of all the unshortened URLs and subtracting SURL for each URL.
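
The three steps above can be sketched as a standalone function with no jQuery or page elements involved, which makes the arithmetic easy to check by hand. The function name `charsLeftAfterShortening`, the sample text, and passing the shortened length in as a parameter are assumptions made for this illustration; they are not how Dr. Twoot itself is organized.

```javascript
var URL_RE = 'https?://[^ \\n]+[^ \\n.,;:?!&\'"’”)}\\]]';

// Hypothetical pure version of the countdown arithmetic.
// text: the tweet as typed; surl: length of a t.co-shortened URL.
function charsLeftAfterShortening(text, surl) {
  var charsLeft = 140 - text.length;                  // step 1: naive count
  var matches = text.match(new RegExp(URL_RE, 'g'));  // step 2: find the URLs
  if (matches) {                                      // step 3: adjust
    charsLeft += matches.join('').length;             // give back the typed URL lengths
    charsLeft -= matches.length * surl;               // charge surl for each URL
  }
  return charsLeft;
}

var before = "Testing the countdown.";
var after = before + " http://www.example.com/a/rather/long/path/to/some/page.html";
console.log(charsLeftAfterShortening(before, 19)); // → 118
console.log(charsLeftAfterShortening(after, 19));  // → 98
```

The 22-character sentence leaves 118; adding a space and a long URL costs 1 + 19 = 20 more, regardless of how long the URL is as typed.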

SURL is the standard shortened URL length, currently 19, soon to be bumped up to 20. Its value is determined through this one-time inquiry in the $(document).ready function, done when Dr. Twoot is launched:

javascript:
    $.getJSON(CGI, {url:CONFIG_URL}, function(info) {
      SURL = info.short_url_length;
    });
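
Because that request is asynchronous, SURL has no value until the response arrives, and the countdown arithmetic would produce NaN in the meantime. One possible guard is sketched below. The helper name `shortUrlLength` and the fallback value of 20 are assumptions for illustration and are not part of Dr. Twoot's code.

```javascript
// Hypothetical helper: pull short_url_length out of a configuration response,
// falling back to a default when the field is missing or not a finite number.
function shortUrlLength(info, fallback) {
  var n = info && info.short_url_length;
  return (typeof n === 'number' && isFinite(n)) ? n : fallback;
}

console.log(shortUrlLength({short_url_length: 19}, 20)); // → 19
console.log(shortUrlLength({}, 20));                     // → 20
console.log(shortUrlLength(null, 20));                   // → 20
```

Under this sketch, SURL could be initialized with the fallback at the top of the file and then overwritten inside the callback once the real value comes back.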

The upshot is that the countdown value gives the number of characters left after shortening even though the tweet field is showing the URLs before shortening.

Dr. Twoot counting down with a URL

Notice how the countdown field is just 20 less than what it was before (1 for the space character after the period plus 19 for the prospective t.co shortened link), even though we’ve added way more than 20 characters to the field.

That takes care of shortening my tweets. To display shortened tweets the way Twitter does, I rewrote the htmlify function. In the past, it searched for things that looked like URLs and wrapped them in anchor tags. Now it uses the Tweet Entities that come with the tweet.

javascript:
34:  function htmlify(body, entities) {
35:    var urls = entities.urls;
36:    var users = entities.user_mentions;
37:    var media = entities.media;
38:
39:    // Handle links.
40:    $.each(urls, function(i, u) {
41:      if (u.display_url != null) {
42:        var link = '<a href="' + u.expanded_url + '">' + u.display_url + '</a>';
43:      }
44:      else {
45:        var link = '<a href="' + u.url + '">' + u.url + '</a>';
46:      }
47:      body = body.replace(u.url, link);
48:    }); // each
49:
50:    // Handle Twitter names.
51:    $.each(users, function(i, u) {
52:      var link = '<a href="http://twitter.com/' + u.screen_name + '">' + '@' + u.screen_name + '</a>';
53:      body = body.replace('@' + u.screen_name, link);
54:    }); // each
55:    
56:    // Handle media. For some reason, media is undefined rather than an empty
57:    // list, so we have to check before trying to loop through.
58:    // I've decided to comment out this whole thing until the media entity starts
59:    // getting used by people I follow.
60:    // if (typeof media != 'undefined') {
61:    //   $.each(media, function(i, u) {
62:    //     alert(body);
63:    //     if (u.media_url != null) {
64:    //       link = '<a href="' + u.expanded_url + '">' + '<img src="' + u.media_url + '"></a>';
65:    //     }
66:    //     else {
67:    //       link = '<a href="' + u.expanded_url + '">' + u.display_url + '</a>';
68:    //     }
69:    //     body = body.replace(u.url, link);
70:    //   }) // each
71:    // } // if
72:      
73:    // turn newlines into breaks
74:    body = body.replace(/\n/g, '<br />');
75:    return body;
76:  }

The display of URLs is handled in Lines 40-48. If a URL in the tweet has an associated display URL, show that and link it to the expanded URL. If a URL doesn’t have an associated display URL, show it and link it as-is.

Since Tweet Entities also include information about mentions—like @drdrang—I decided to have htmlify use that info to turn the mentions into links. That’s done in Lines 51-54, and I think the code pretty much speaks for itself.
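
To see the link and mention handling in action without a browser, here is a jQuery-free sketch of the same idea run against a made-up tweet. The function name `linkify` and the sample tweet and entities are hypothetical; the replacement logic mirrors the approach described above but is not the code from Dr. Twoot.

```javascript
// DOM-free sketch of the link and mention handling; all sample data is invented.
function linkify(body, entities) {
  // URLs: show the display URL linked to the expanded URL when available,
  // otherwise show and link the URL as-is.
  (entities.urls || []).forEach(function(u) {
    var hasDisplay = (u.display_url != null);
    var text = hasDisplay ? u.display_url : u.url;
    var href = hasDisplay ? u.expanded_url : u.url;
    body = body.replace(u.url, '<a href="' + href + '">' + text + '</a>');
  });
  // Mentions: turn @name into a link to that user's Twitter page.
  (entities.user_mentions || []).forEach(function(u) {
    var name = '@' + u.screen_name;
    var link = '<a href="http://twitter.com/' + u.screen_name + '">' + name + '</a>';
    body = body.replace(name, link);
  });
  return body;
}

var tweet = "Thanks @drdrang: http://t.co/abc1234";
var entities = {
  urls: [{
    url: "http://t.co/abc1234",
    display_url: "example.com/post",
    expanded_url: "http://example.com/post.html"
  }],
  user_mentions: [{screen_name: "drdrang"}]
};
console.log(linkify(tweet, entities));
```

Note that `replace` with a string pattern changes only the first occurrence, so a tweet that repeats the same URL or mention would get only its first copy linked under this sketch.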

The commented-out stuff in Lines 56-71 is for displaying images. Twitter now has “native” image sharing through a deal with Photobucket, and information about those images will be included in the Tweet Entities metadata under the “media” key. As best I can tell, no one I follow is using this new service; everyone is still linking to Twitpic or yFrog or Flickr. In theory, the code in Lines 56-71 will work, but since I haven’t been able to test it I figured it’d be safer to keep it behind comments.

You can see how Tweet Entities work in this screenshot:

Dr. Twoot with a display URL

Instead of showing me some http://t.co/blahblah gibberish, Dr. Twoot now shows a link that’s easy to interpret, just like the official Twitter clients do. And the link goes directly to the desired page instead of being passed through t.co first.

Overall, this was a pretty small rewrite for pretty big results. Both my followers and I get to take advantage of the cleaner display URLs, and I still have an accurate character countdown as I compose my tweets. I’m a little bummed that some of my old TextExpander snippets will be put out to pasture, but it’s a small price to pay for making Dr. Twoot look and act more like a professionally written application.