My life as a Twitter spammer

Something weird is going on with Twitter’s shortened links. I don’t know how it’s happening, but there must be several people on Twitter who think I’m using fake accounts to spam them with links to one of my posts.

I get many visits from people following links from the t.co domain, Twitter’s own link shortening service. That isn’t unusual. Like many bloggers, I often tweet a link to a post I’ve just written. But when a post I didn’t tweet about starts getting hits from t.co, it means someone else has tweeted a link to me. Obviously, if my Twitter handle is included in the tweet, there’s no mystery; I’ll probably already have seen the link in my mentions. Sometimes, though, the link comes without a mention, and if I have the time I try to find out who did it.

Yesterday morning I was home from work, trying to fight off the beginning of a cold (unsuccessfully), so I had the time. I went to twitter.com and searched for “leancrew.” Twitter initially shows the “top” tweets, using whatever algorithm it has for determining “top,” but you can switch to “all” tweets from a little popup menu near the top of the page.

Tweet search results options

When I did that, a bunch of obvious spam appeared in the results.

Twitter spam in search results

“Leancrew” didn’t appear in any of these tweets, but by hovering over the link I could see the URL of one of my posts in the popup.

Spam link to me

WTF? You won’t be surprised to learn, I’m sure, that clicking those links takes you not to my post but to a site that purports to be giving away iPads. So why does the little popup show one of my URLs?

My understanding of that popup is that it’s supposed to show the resolved URL of the link, that is, the URL at the end of the redirect chain. For example, if you just post a regular, direct URL into your tweet, Twitter will turn it into a short t.co link which redirects to the original. That’s one step of redirection. If, however, you paste a j.mp link into your tweet, Twitter wraps that into a t.co link before posting and you now have two steps of redirection: from t.co to j.mp and then from j.mp to the original site.
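To make the one-step and two-step cases concrete, here is a minimal Python sketch that follows a chain through a lookup table instead of the network. The short links and the example.com destination are invented for illustration; only the shape of the chains comes from the description above.

```python
def follow(url, redirects, max_hops=10):
    """Return every URL visited, starting at url and ending at the resolved one."""
    chain = [url]
    while chain[-1] in redirects:
        if len(chain) > max_hops:
            raise RuntimeError("redirect loop or chain too long")
        chain.append(redirects[chain[-1]])
    return chain


# Hypothetical redirect table: each key redirects to its value.
REDIRECTS = {
    "http://t.co/aaa": "http://example.com/some-post",  # direct URL wrapped by t.co
    "http://t.co/bbb": "http://j.mp/ccc",                # j.mp link wrapped by t.co
    "http://j.mp/ccc": "http://example.com/some-post",
}

one_step = follow("http://t.co/aaa", REDIRECTS)
two_step = follow("http://t.co/bbb", REDIRECTS)
print(len(one_step) - 1, one_step[-1])
print(len(two_step) - 1, two_step[-1])
```

Both chains end at the same place; the popup, as I understand it, is meant to show that last element.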

Some of this information is included in the tweet entities section of the tweet. For example, let’s look at this non-spam tweet:

BoingBoing: Lego Moleskine notebooks j.mp/AlbEFV
  — macdrifter (@macdrifter) Sat Jan 28 2012

By going to Twitter’s Exploring the Twitter API page we can see what this tweet is composed of by issuing a GET on the URL

https://api.twitter.com/1/statuses/show.json?id=163262396130017281&include_entities=true

In the JSON that’s returned, the entities part looks like this:

"entities":  {
   "user_mentions":  [],
   "urls":  [
      {
       "url": "http://t.co/2ukW7OSE",
       "display_url": "j.mp/AlbEFV",
       "indices":  [
         37,
         57
       ],
       "expanded_url": "http://j.mp/AlbEFV"
     }
   ],
   "hashtags":  []
 },

The url is the t.co-shortened link that Twitter made, the expanded_url is the j.mp URL you pasted into the tweet,1 and display_url is the text that appears in the tweet.
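As a rough illustration of how a client might use those three fields, here is a small Python sketch. The entity values are copied from the JSON above, but the raw text string is my own reconstruction (the API response's text field isn't shown here), chosen so that the indices 37 and 57 bracket the t.co link. The render function is a hypothetical helper, not anything Twitter provides.

```python
# Entity values copied from the JSON above. `text` is an assumption,
# reconstructed so that indices 37-57 line up with the t.co link.
text = "BoingBoing: Lego Moleskine notebooks http://t.co/2ukW7OSE"
urls = [
    {
        "url": "http://t.co/2ukW7OSE",
        "display_url": "j.mp/AlbEFV",
        "indices": [37, 57],
        "expanded_url": "http://j.mp/AlbEFV",
    }
]


def render(text, urls):
    """Swap each t.co link for its display_url; return (shown text, link targets)."""
    targets = []
    # Work from the end so earlier indices stay valid after each replacement.
    for entity in sorted(urls, key=lambda e: e["indices"][0], reverse=True):
        start, end = entity["indices"]
        assert text[start:end] == entity["url"]
        text = text[:start] + entity["display_url"] + text[end:]
        targets.append(entity["expanded_url"])
    return text, targets[::-1]


shown, targets = render(text, urls)
print(shown)    # BoingBoing: Lego Moleskine notebooks j.mp/AlbEFV
print(targets)  # ['http://j.mp/AlbEFV']
```

Note that nothing in the entities gives the final destination; the resolved URL has to come from somewhere else.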

If you look at this tweet on its own page, you’ll see that the popup shows the expanded_url.

Popup on individual tweet page

But if you look at it in search results or a timeline, the popup shows the resolved URL that j.mp redirects to.

Popup in timeline

So what does this have to do with the spam tweets I saw? Well, if I GET one of the spams, the entities section is

"entities":  {
  "urls":  [
     {
      "url": "http://t.co/mNL7S05v",
      "display_url": "hi24.info/c1b",
      "expanded_url": "http://hi24.info/c1b",
      "indices":  [
        15,
        35
      ]
    }
  ],
  "user_mentions":  [
     {
      "id_str": "421544542",
      "name": "Sifri y Quique",
      "screen_name": "quiquesifri96",
      "indices":  [
        0,
        14
      ],
      "id": 421544542
    }
  ],
  "hashtags":  []
},

which is very much like @macdrifter’s tweet, with hi24.info URLs taking the place of the j.mp URLs.2

Somehow, then, when Twitter resolves that hi24.info URL to make the popup, it resolves to my post; but when a user clicks on it in a browser, it resolves to the spam page offering free iPads.

There’s actually quite a long chain of redirects associated with that hi24.info URL. When I pass it to wget, I get this response:

$ wget http://hi24.info/c1b
--2012-01-28 09:04:41--  http://hi24.info/c1b
Resolving hi24.info... 173.208.182.10
Connecting to hi24.info|173.208.182.10|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://oldcarreviews.info/index.php?t=1327316945403 [following]
--2012-01-28 09:04:41--  http://oldcarreviews.info/index.php?t=1327316945403
Resolving oldcarreviews.info... 66.85.156.14
Connecting to oldcarreviews.info|66.85.156.14|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://google.com/ [following]
--2012-01-28 09:04:43--  http://google.com/
Resolving google.com... 74.125.225.147, 74.125.225.148, 74.125.225.144, ...
Connecting to google.com|74.125.225.147|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.google.com/ [following]
--2012-01-28 09:04:44--  http://www.google.com/
Resolving www.google.com... 74.125.225.114, 74.125.225.115, 74.125.225.116, ...
Connecting to www.google.com|74.125.225.114|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
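
If you want just the hops without the connection chatter, a few lines of Python can pull them out of the wget output. This is only a sketch: the log below is an abridged copy of the output above, and redirect_hops is a helper name I made up.

```python
import re

# Abridged copy of the wget output above (connection lines removed).
WGET_LOG = """\
--2012-01-28 09:04:41--  http://hi24.info/c1b
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://oldcarreviews.info/index.php?t=1327316945403 [following]
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://google.com/ [following]
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.google.com/ [following]
HTTP request sent, awaiting response... 200 OK
"""


def redirect_hops(log):
    """Return (status code, target URL) for every redirect in a wget log."""
    hops = []
    status = None
    for line in log.splitlines():
        m = re.search(r"awaiting response\.\.\. (\d{3})", line)
        if m:
            status = int(m.group(1))  # remember the status for the next Location line
            continue
        m = re.match(r"Location: (\S+)", line)
        if m:
            hops.append((status, m.group(1)))
    return hops


hops = redirect_hops(WGET_LOG)
for status, target in hops:
    print(status, target)
```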

To me, the weirdest thing is that after redirecting to oldcarreviews.info, it redirects to google.com (twice). I don’t understand how a google.com request ends up at a spam page.

Anyway, to bring this rambling post to a close, my guess is that hi24.info uses the headers of the HTTP request to figure out whether it’s getting an inquiry from Twitter or from a normal browser. If it’s from Twitter, it returns an inoffensive URL (like mine); if it’s from a browser (or wget), it sets off the chain of redirects that lands at the spam page.
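To be clear about what that guess amounts to, here is a Python sketch of the kind of logic a server could use. It is purely illustrative: I have no idea what the spammer's server actually runs, and the URLs, the "twitterbot" marker, and the choice of the User-Agent header are all assumptions of mine.

```python
# Hypothetical stand-ins: none of these values come from the real server.
DECOY_URL = "http://example.org/innocent-post"
SPAM_CHAIN_URL = "http://example.net/start-of-spam-chain"
CRAWLER_MARKERS = ("twitterbot",)  # assumed User-Agent substring


def choose_location(headers):
    """Pick the redirect target from the request headers, as the guess describes."""
    agent = headers.get("User-Agent", "").lower()
    if any(marker in agent for marker in CRAWLER_MARKERS):
        return DECOY_URL       # the link resolver sees a harmless page
    return SPAM_CHAIN_URL      # browsers and wget get the spam chain


for_resolver = choose_location({"User-Agent": "Twitterbot/1.0"})
for_browser = choose_location({"User-Agent": "Wget/1.13.4"})
print(for_resolver)
print(for_browser)
```

A server could just as easily key on the requesting IP address or some other header; the point is only that the same short URL can answer different askers differently.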

I’m kind of pissed at this. I don’t want my domain associated with spam, but I don’t know how to stop it. It’s similar to when email spam spoofs your address, but I think most people know by now that email addresses can be spoofed. Twitter’s stated reason for wrapping all links in t.co redirects is to reduce spam and keep us safe. In this case, not only is the spam getting through (the @bandurazqoeb3 and @klughadmq0 accounts are still active as I write this), but the users getting these spams would be justified in thinking I have something to do with them.

I tweeted a complaint to Twitter’s @spam account but got no response. When I’m done posting this, I’ll go through the steps in this security form to report the issue. But I don’t see my career as an inadvertent Twitter spammer ending soon.


  1. Or maybe you pasted in a long URL and your Twitter client turned it into a j.mp link for you. The point is that Twitter received the j.mp URL. 

  2. And of course the spam has a user mention—it wouldn’t be Twitter spam without that. 


5 Responses to “My life as a Twitter spammer”

  1. Ben K says:

    The systemic solution to all this is to eliminate all of the asinine redirection that Twitter forces on its users. Having to depend on the DNS authority of Colombia and the Northern Mariana Islands simply to convey your damn URL is fundamentally ridiculous.

    Moreover, really, the problem with Twitter is that it’s a commercial and closed single-point enterprise, rather than a system of distributed design (like Jabber, SMTP, etc.). But that’s a whole other rant.

    b

  2. Dr. Drang says:

    Given that Twitter is almost certainly the biggest promulgator of shortened links, it makes perfect sense for it to have its own URL shortener. I don’t even object to all URLs in tweets being passed through it.

    What bothers me here is Twitter trying to be clever and displaying what it thinks is the ultimate URL for the hi24.info link. When it gets fooled by a spammer, it passes the deceit along to its users and makes me (and, I’m sure, many other innocent parties) look like accessories. If the popup simply showed the expanded_url, users would know they’re taking a chance with a shortened URL that could take them anywhere.

    The samples I found were, of course, obvious spam. But whatever hi24.info is doing could be applied to more sophisticated tweets that are written to look like legitimate messages.

  3. Ben K says:

    Right. But if Twitter didn’t unequivocally adulterate everything resembling a URL that is pasted into a tweet, this whole discussion would be moot.

    I suppose your explanation for Twitter’s own (Colombia-based) shortener is somewhat reasonable, but then again, I don’t understand the service’s basic justification. After all, the purpose of the shortener is to reduce imposition against the 140-character display limit. But since Twitter is now storing the real URL behind the scenes, why can’t the client simply link to it directly? What value does the “t.co” domain and its intermediate redirect bring to the table?

    b

  4. Dr. Drang says:

    Clients can link to it directly. That’s what I do in Dr. Twoot. Twitter’s web site doesn’t, probably because it wants to track clicks.

    I assume Twitter started using t.co because:

    1. It wants to maintain the 140 character limit. This is necessary to send tweets via text message, although I have to believe very few people interact with Twitter that way anymore. More important, though, is that the limit is part of Twitter’s character. It wouldn’t be Twitter with longer messages.
    2. Other shorteners had stepped into the vacuum and were getting the “business” Twitter felt should be its own.
    3. It has some vague notion of monetizing t.co in the future.

    Personally, I think using a 3rd-party shortener now that Twitter shortens all URLs is foolish and counterproductive. If you use bit.ly or j.mp you are responsible for the extra redirection, not Twitter.

    That said, I think Twitter could simply count URLs as 20 characters (regardless of their actual length) and send the full URL out everywhere except in text messages. As you say, the full URL is being sent in most cases anyway.

  5. Ben K says:

    I agree with you that using any third-party redirector is foolish, fragile, and contrary to the purpose of URLs as uniform resource locators. May the earned wrath be upon those who employ them!

    140-char tweets don’t fit into SMS messages, and probably never did, unless the Twitter alphabet were limited to 7-bit ASCII. (Maybe it was at one time; I haven’t looked into it.) Twitter apparently accepts up to 140 characters of Unicode, while an SMS message encoded in UTF-16 is limited to 70 chars—and that doesn’t consider the prepended boilerplate like “@soandso: ” before the message body.

    b