Batch comparison of Git repositories

One of the problems I have with Git is the nagging sense that I’m always doing things the hard way, that a Git guru would look at my workflow and laugh at the naiveté. And so it is with some trepidation that I present a little shell script I use to see if the local copies of my repositories match the copies at GitHub.

As best I can tell, the git diff command is the appropriate way to see if two repositories match up. To compare the local with the remote, you execute something like

git diff HEAD origin/HEAD

from within the local repository. What I wanted was a single command that would do this for all my repositories. My Googling came up empty, so I wrote a little shell script to do the job.
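One caveat about that command: origin/HEAD is a remote-tracking reference, so it reflects only what the last fetch, pull, or push saw, not necessarily what's on GitHub right now. A minimal sketch of a fresher comparison, assuming the remote is named origin and using a hypothetical function name, fetch_and_diff:

```bash
#!/bin/bash
# Sketch only: run from inside a repository whose remote is named "origin".
# fetch_and_diff is a made-up name for illustration, not a Git command.
fetch_and_diff() {
  # Update the remote-tracking refs first, then compare against them.
  # If the fetch fails (no network, no remote), the diff is skipped.
  git fetch --quiet origin && git diff HEAD origin/HEAD
}
```

The fetch needs network access, so it's noticeably slower than the plain diff.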

The script, called “github-compare,” takes advantage of the way my repositories are organized. They’re all inside a git directory in my home directory. In essence, then, all the script has to do is loop through the subdirectories of ~/git/ and run the above command. And apart from some code to keep the output simple and clean, that’s what the script does.

 1:  #!/bin/bash
 3:  GIT=/usr/local/git/bin/git
 5:  for d in ~/git/*; do
 6:    cd "$d" || continue
 7:    if [ -z "`$GIT diff HEAD origin/HEAD 2> /dev/null`" ]; then
 8:      echo "Same ->" `basename "$d"`
 9:    else
10:      echo "Different ->" `basename "$d"`
11:    fi
12:  done

Line 3 defines the Git executable. The loop that starts on Line 5 goes into all the subdirectories of ~/git/, runs the git diff command, and prints out “Same” or “Different” depending on the result. To keep the script’s output simple, I don’t actually print any of the diff output; STDERR messages are sent to /dev/null, and STDOUT is just checked to see if it’s of zero length (that’s the -z test in Line 7).
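One side effect of the -z test is that a directory where git diff fails (no origin/HEAD, say, or not a repository at all) produces no STDOUT and so gets reported as “Same.” A possible variant, sketched here rather than tested on my own setup, asks Git for an exit status instead: git diff --quiet exits with 0 when there are no differences and 1 when there are. The function name compare_repos and the “Unknown” label are made up for illustration, and the sketch assumes git is on the PATH.

```bash
#!/bin/bash
# Sketch only: compare_repos and the "Unknown" label are hypothetical.
# The optional argument is the directory holding the repositories.
compare_repos() {
  local base=${1:-$HOME/git} d name
  for d in "$base"/*/; do
    [ -d "$d" ] || continue          # the glob matched nothing
    name=$(basename "$d")
    (
      cd "$d" || exit 2
      # No origin/HEAD, or not a repository: report it rather than guess.
      git rev-parse --verify --quiet origin/HEAD > /dev/null 2>&1 || exit 2
      # --quiet prints nothing; the exit status is 0 (same) or 1 (different).
      git diff --quiet HEAD origin/HEAD 2> /dev/null
    )
    case $? in
      0) echo "Same -> $name" ;;
      1) echo "Different -> $name" ;;
      *) echo "Unknown -> $name" ;;
    esac
  done
}
```

Running the comparison in a subshell means the cd never leaks out of the loop, so the function can be called from any directory.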

Here’s some sample output:

Same -> blackbirdpy
Same -> blog-preview
Same -> bwana
Same -> drtwoot
Same -> jumpcut
Same -> mail-preview
Different -> php-markdown-extra-math
Same -> php-smartypants
Same -> wp-drang-theme

The purpose of the script isn’t to say how the local repo differs from the remote one, only to tell me if it’s different. Once I know which, if any, are out of sync, I can look at those that differ in greater detail.

I await criticisms and suggestions from those who really know what they’re doing.

Update 12/28/10
Academic twitterers have come to my aid.

First, Allen MacKenzie (@mackenab) of Virginia Tech points me to GitGot, a Perl module and command-line tool by John SJ Anderson that does a lot of Gittish things on groups of repositories all at once instead of one repo at a time. The command that most closely matches my little shell script is got status, which generates output like this,

1) blog-preview             : OK 
2) drtwoot                  : OK Ahead by 1
3) php-markdown-extra-math  : OK

which is both compact and more informative than my script’s output.

Sadly, the got output is both garishly colored (which I’ve spared you in the example above) and designed to be used on a terminal emulator with a black background. As is, I couldn’t stand to use it on a regular basis (I’m quite sensitive, you know). The color problem may be easy to overcome; I haven’t had a chance to look at the source code yet.

Next, Mark Eli Kalderon (@PhilGeek) of University College London tells me about the --shortstat option to git diff, the inclusion of which in my script would add information to its output without making it much longer. An example of output using --shortstat is

2 files changed, 9 insertions(+), 9 deletions(-)

I’ll have to think about a clean way to include this info.
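One possibility, offered as a sketch and not a settled design: capture the --shortstat line and tack it onto the “Different” line in parentheses. The function name compare_with_stats and the parenthesized format are assumptions for illustration, and git is assumed to be on the PATH.

```bash
#!/bin/bash
# Sketch only: compare_with_stats and the output format are hypothetical.
# The optional argument is the directory holding the repositories.
compare_with_stats() {
  local base=${1:-$HOME/git} d name stat
  for d in "$base"/*/; do
    [ -d "$d" ] || continue          # the glob matched nothing
    name=$(basename "$d")
    # Empty output means no differences (or, as in the original script,
    # that git diff failed and its error message went to /dev/null).
    stat=$(cd "$d" && git diff --shortstat HEAD origin/HEAD 2> /dev/null)
    if [ -z "$stat" ]; then
      echo "Same -> $name"
    else
      # --shortstat output begins with a space; strip it before printing.
      echo "Different -> $name (${stat# })"
    fi
  done
}
```

Because the diff runs from HEAD to origin/HEAD, lines that exist only in the local copy are counted as deletions.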

I can’t mention UCL without commenting on this article in yesterday’s NY Times. It’s about crowdsourcing the transcription of handwritten historical documents, in particular a project at UCL to transcribe the works of Jeremy Bentham. The article is interesting, but what really caught my eye was this photograph and the accompanying caption.

Bentham's auto-icon

Jeremy Bentham’s preserved, clothed corpse, with a waxworks replica of his actual head at bottom, has greeted visitors to the college since 1850.

According to Wikipedia, the real head is locked away to prevent vandalism and “student pranks.”

Mark tells me that Bentham himself called his preserved corpse—before the fact, one assumes—the “auto-icon” and that’s how UCL refers to it.

Really makes me wish I’d known about this several years ago when we took a family vacation in London. It would have made a great addition to our Bloomsbury/British Museum day.

Update 12/28/10
John SJ Anderson (@genehack) is a nice guy, and GitGot is a decent tool, but I really wish I hadn’t installed it. The chain of CPAN module dependencies necessary to install GitGot was a mile long, and now I have dozens of Perl modules installed that I’ll never use. Worse, the cpan utility destroyed itself, and now I have a cpan5.10.0 and a cpan5.8.9 instead. The perldoc command is similarly split, all because I did the GitGot installation using sudo. Feh.

Updated update
Repairing disk permissions through Disk Utility restored both cpan and perldoc. Thanks to this SuperUser article for the suggestion. It also suggests using Pacifist and the system disk to get back to the original Snow Leopard version of Perl. I don’t think I’m brave enough to try that.