Navigating Mathematica with Keyboard Maestro

If you think I’ve written too much about Mathematica recently, don’t worry. This post discusses something in Mathematica that annoys me, but it isn’t really about Mathematica. It’s about how I eliminated that annoyance with Keyboard Maestro and how I should have done so long ago.

What I hate about Mathematica is navigating in its help window. Like Safari, the Mathematica help window has backward and forward navigation arrows up near the top left,

Mathematica documentation window

but unlike Safari, it doesn’t use the standard keyboard shortcuts of ⌘[ and ⌘] to navigate. Wolfram’s documentation says you can use “Alt ←” and “Alt →” to navigate, but those don’t work.1 If there’s a cursor blinking somewhere in the window, ⌥← and ⌥→ move the cursor left and right, which is standard Mac cursor control. If there’s no active cursor in the help window, ⌥← and ⌥→ do nothing.

And even if Mathematica worked the way its documentation says it works, and it overrode the standard behavior of ⌥← and ⌥→, I would never use them for navigation because my fingers don’t think that way. They want to use ⌘[ and ⌘] because that’s what Safari has trained them to use (also Firefox and Chrome).

Recently, during a particularly intense session with Mathematica’s help system, I got angry with myself. Why did I keep hitting ⌘[ and ⌘] and getting angry with Mathematica? Yes, Wolfram should adhere to Mac conventions, but you’re not powerless. Stop getting angry and fix the problem!

So I opened Keyboard Maestro and had a pair of one-step macros written in a few minutes. Here’s Help Back,

KM Help Back

and here’s Help Forward,

KM Help Forward

They’re saved in Keyboard Maestro’s Wolfram group, which means they work only when Mathematica is the active application.2

Initially, I tried to use Keyboard Maestro’s Click at Found Image action, but that didn’t work consistently. Apparently, Mathematica’s arrow buttons are too anemic for Keyboard Maestro to find reliably. So I switched to Move or Click Mouse and all was right with the world.

I should mention that the coordinates of the click with respect to the upper left corner of the window were determined with the help of the Get button. Click it and you have a five-second countdown (with beeps) to bring the appropriate window to the front and position the mouse where you want the click to happen. The coordinates fill in automatically. It’s one of those features that users love about Keyboard Maestro.

Now when my fingers press ⌘[ and ⌘], Mathematica’s help window responds the way it should, and I don’t get angry anymore. Except a little bit at myself for not doing this ages ago.


  1. Normally, this is where I’d complain that Macs don’t have Alt keys, they have Option keys, but I’m going to let that pass. See how I’m letting that pass? 

  2. Wolfram has really messed up its naming. The documentation calls the app either “Mathematica” or “Wolfram Mathematica,” depending on what part of the documentation you’re looking at. My account says I have a license for “Wolfram Mathematica.” The app’s icon and the menu bar when the app is running say it’s “Wolfram.” Keyboard Maestro goes with the name in the menu bar. 


Dot plots in Mathematica

I ran across Mathematica’s NumberLinePlot function recently and wondered if I could use it to make dot plots. The answer was yes, but not directly.

By “dot plots,” I mean the things that look sort of like histogram but with stacked dots instead of boxes to represent each count. Here’s an example from a nice paper on dot plots by Leland Wilkinson (the Grammar of Graphics guy):

Dot plot from Wilkinson

The paper is referenced in the Wikipedia article on dot plots, and rightly so. Here’s another in a slightly different style from Statistics for Experimenters by Box, Hunter, and Hunter (who call them dot diagrams):

Dot diagram from BHH

(The vertical line and arrow are add-ons to the dot plot pertinent to the particular problem BH&H were discussing.)

If you’re a William Cleveland fan, you should be aware that the kind of dot plots I’ll be talking about here are not what he calls dot plots. His dot plots are basically bar charts with dots where the ends of the bars would be. Like this example from The Elements of Graphing Data:

Dot plot from Cleveland

With that introduction out of the way, let’s talk about Wilkinson-style dot plots and NumberLinePlot.

In a nutshell, NumberLinePlot takes a list of numbers and produces a graph in which the numbers are plotted as dots along a horizontal axis—very much like the number lines I learned about in elementary school. I wondered what it would do if some of the numbers were repeated, so I defined this list of 30 integers with a bunch of repeats,

a = {7, 2, 4, 8, 5, 7, 4, 2, 1, 3, 7, 5, 7, 8, 7,
     2, 5, 6, 7, 6, 1, 5, 2, 1, 6, 5, 8, 3, 8, 3}

and ran NumberLinePlot[a]. Here’s the output:

Unstyled NumberLinePlot of flat list

Unfortunately, it plots the repeated numbers on top of one another so you can’t see how many repeats there are. I looked through the documentation to see if I could pass an option to NumberLinePlot to get it to stack the repeated dots vertically, but no go.

There is, however, a way to get NumberLinePlot to stack dots. You have to pass it a nested list set up with sublists having no repeated elements. For example, if we rewrite our original flat list, a, this way,

b = {{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7, 8},
     {1, 2, 3, 5, 6, 7, 8}, {2, 5, 7, 8}, {5, 7}, {7}}

and run NumberLinePlot[b], we get this graph:

Unstyled NumberLinePlot of nested list

As you can see, each sublist is plotted as its own row. That’s why each row has a different color: Mathematica sees this as a combination of six number line plots and gives each its own color by default. But don’t worry about the colors or the size and spacing of the dots; those can all be changed by passing options to NumberLinePlot. What you should worry about is how to turn flat list a into nested list b. Doing it by hand is both tedious and prone to error.

As it happens, someone has already worked that out and combined it with NumberLinePlot to make an external function, DotPlot, that makes the kind of graph we want. Running ResourceFunction["DotPlot"][a] produces

DotPlot output without styling

Again, we can add options to DotPlot to give it the style Wilkinson suggests. Calling it this way,

ResourceFunction["DotPlot"][a, Ticks -> {Range[8]}, 
   AspectRatio -> .2, PlotRange -> {.5, 8.5},
   PlotStyle -> Directive[Black, PointSize[.028]]]

gives us

Dot plot with styling

After setting the aspect ratio to 1:5, I had to fiddle around with the PointSize to get the dots to sit on top of one another. It didn’t take long.

This is great, but I wanted to know what DotPlot was doing behind the scenes. So I downloaded its source notebook and poked around. The code wasn’t especially complicated, but there was one very clever piece. Let’s go through the steps.

First, we Sort the original flat list and Split it into a list of lists in which each sublist consists of a single repeated value.

nest = Split[Sort[a]]

yields

{{1, 1, 1}, {2, 2, 2, 2}, {3, 3, 3}, {4, 4}, {5, 5, 5, 5, 5},
 {6, 6, 6}, {7, 7, 7, 7, 7, 7}, {8, 8, 8, 8}}

Compare this with the nested list b we used above and you can see that they’re sort of opposites. If we think of these nested lists as matrices, the rows of b are the columns of nest, and vice versa. That suggests a Transpose of nest would get us to b, but the problem is that neither nest nor b are full matrices. They both have rows of different lengths.

The solution to this is a three-step process:

  1. Pad the rows of nest with dummy values so they’re all the same length.
  2. Transpose that padded list of lists.
  3. Delete the dummy values from the transpose.

The code for the first step is

maxlen = Max[Map[Length, nest]];
padInnerLists[l_] := PadRight[l, maxlen, "x"]
padded = Map[padInnerLists, nest]

This sets maxlen to 6 and pads all of nest’s rows out to that length by adding x strings. The result is this value for padded:

{{1, 1, 1, "x", "x", "x"}, {2, 2, 2, 2, "x", "x"},
 {3, 3, 3, "x", "x", "x"}, {4, 4, "x", "x", "x", "x"},
 {5, 5, 5, 5, 5, "x"}, {6, 6, 6, "x", "x", "x"},
 {7, 7, 7, 7, 7, 7}, {8, 8, 8, 8, "x", "x"}}

I should mention here that Map works the same way mapping functions work in other languages: it applies function to each element of a list and returns the list of results.

Now we can transpose it with

paddedT = Transpose[padded]

which yields

{{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7, 8},
 {1, 2, 3, "x", 5, 6, 7, 8}, {"x", 2, "x", "x", 5, "x", 7, 8},
 {"x", "x", "x", "x", 5, "x", 7, "x"},
 {"x", "x", "x", "x", "x", "x", 7, "x"}}

The last step is to use DeleteCases to get rid of the dummy values.

nestT =  DeleteCases[paddedT, "x", All]

yields

{{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7, 8},
 {1, 2, 3, 5, 6, 7, 8}, {2, 5, 7, 8}, {5, 7}, {7}}

which is exactly what we want. This is what DotPlot does to prepare the original flat list for passing to NumberLinePlot.1

Now that I know how DotPlot works, I feel comfortable using it.

You may be wondering why I don’t just run

Histogram[a, {1}, Ticks -> {Range[8]}]

to get a histogram that presents basically the same information.

Histogram of flat list

First, where’s the fun in that? Second, dot plots appeal to my sense of history. They have a kind of hand-drawn look that reminds me of using tally marks to count items in the days before we all used computers.


  1. DotPlot uses anonymous functions and more compact notation than I’ve used here. But the effect is the same, and I wanted to avoid weird symbols like #, &, and /@


Baseball durations after the pitch clock

A couple of years ago, after Major League Baseball announced they’d start using a pitch clock, I said I would write a post about the change in the duration of regular season games. I didn’t. My recent post about Apple’s Sports app and its misunderstanding of baseball standings reminded me of my broken promise. I decided to write a new game duration post after the 2025 season ended.

Unfortunately, Retrosheet, which is where I get my data, hasn’t published its 2025 game logs yet. I decided to go ahead with the post anyway, mainly because putting it off would probably lead to my forgetting again. So here’s the graph of regular season durations through the 2024 season. When the 2025 data comes in, I’ll update it (I hope).

Baseball game durations

The black line is the median duration, and the light blue zone behind it is the interquartile range. The middle 50% of games lie within that range.

Clearly, there was a huge dropoff in 2023, so I’d say the pitch clock was a big success. The further reduction in 2024 was within the historical year-to-year duration change—I wouldn’t attribute any significance to it.

Games are now roughly as long as they were in the early ’80s, so the powers at MLB have cut about four decades of fat from the game. And they did it without reducing the number of commercials, because they’d never do that.

I made the graph more or less the same way I made the last one, although I combined some of the steps into a single script. First, I downloaded and unzipped all the yearly game logs since 1920 from Retrosheet. They have names like gl1920.txt. Then I converted the line endings from DOS-style to Unix-style via

dos2unix gl*.txt

I got the dos2unix command from Homebrew.

I concatenated all the data into one big (189 MB) file using

cat gl*.txt > gl.csv

Although the files from Retrosheet have a .txt extension, they’re actually CSV files (albeit without a header line). That’s why I gave the resulting file a .csv extension.

I then ran the following script, which uses Python and Pandas to make a dataframe with just the columns I want from the big CSV file and calculate the quartile values for game durations on a year-by-year basis.

python:
 1:  #!/usr/bin/env python3
 2:  
 3:  import pandas as pd
 4:  from scipy.stats import scoreatpercentile
 5:  import sys
 6:  
 7:  # Make a simplified dataframe of all games with just the columns I want
 8:  cols = [0, 3, 4, 6, 7, 9, 10, 18]
 9:  colnames = 'Date VTeam VLeague HTeam HLeague VScore HScore Time'.split()
10:  df = pd.read_csv('gl.csv', usecols=cols, names=colnames, parse_dates=[0])
11:  
12:  # Add a column for the year
13:  df['Year'] = df.Date.dt.year
14:  
15:  # Use the dataframe created above to make a new dataframe
16:  # with the game duration quartiles for each year
17:  cols = 'Year Q1 Median Q3'.split()
18:  dfq = pd.DataFrame(columns=cols)
19:  for y in df.Year.unique():
20:    p25 = scoreatpercentile(df.Time[df.Year==y], 25)
21:    p50 = scoreatpercentile(df.Time[df.Year==y], 50)
22:    p75 = scoreatpercentile(df.Time[df.Year==y], 75)
23:    dfq.loc[y] = [y, p25, p50, p75]
24:  
25:  # Write a CSV file for the yearly duration quartiles
26:  dfq.Year = dfq.Year.astype('int32')
27:  dfq.to_csv('gametimes.csv', index=False)

This code is very similar to what I used a couple of years ago. You can read that post if you want an explanation of any of the details.

Now I had a new CSV file, called gametimes.csv, that looked like this

Year,Q1,Median,Q3
1920,99.0,109.5,120.0
1921,100.0,111.0,125.0
1922,100.0,110.5,124.0
1923,102.0,112.0,125.0
1924,101.0,112.0,125.0
1925,102.0,114.0,127.0
[and so on]

The graph was made by running this script, which reads the data from gametimes.csv and uses Matplotlib to output the PNG image shown above:

python:
 1:  #!/usr/bin/env python3
 2:  
 3:  import pandas as pd
 4:  import matplotlib.pyplot as plt
 5:  from datetime import date
 6:  
 7:  # Import game time data
 8:  df = pd.read_csv('gametimes.csv')
 9:  
10:  # Create the plot with a given size in inches
11:  fig, ax = plt.subplots(figsize=(6, 4))
12:  
13:  # Add the interquartile range and the median
14:  plt.fill_between(df.Year, df.Q1, df.Q3, alpha=.25, linewidth=0, color='#0066ff')
15:  ax.plot(df.Year, df.Median, '-', color='black', lw=2)
16:  
17:  # Gridlines and ticks
18:  ax.grid(linewidth=.5, axis='x', which='major', color='#bbbbbb', linestyle='-')
19:  ax.grid(linewidth=.5, axis='y', which='major', color='#bbbbbb', linestyle='-')
20:  ax.tick_params(which='both', width=.5)
21:  
22:  # Title and axis labels
23:  plt.title('Baseball game durations')
24:  plt.xlabel('Year')
25:  plt.ylabel('Minutes per game')
26:  
27:  # Save as PNG
28:  day = date.today()
29:  dstring = f'{day.year:4d}{day.month:02d}{day.day:02d}'
30:  plt.savefig(f'{dstring}-Baseball game durations.png', format='png', dpi=200)

I don’t know when the folks at Retrosheet will add the 2025 data. Maybe after the World Series. I’ll check back then and update the post if there’s a new year of game logs.


Sparking joy

I watched this Numberphile video a few days ago and learned not only about Harshad numbers, which I’d never heard of before, but also some new things about defining functions in Mathematica.

Harshad numbers1 are defined as integers that are divisible by the sum of their digits. For example,

42=424+2=426=7

so 42 is a Harshad number. On the other hand,

76=767+6=7613

so 76 is not. You can use any base to determine whether a number is Harshad or not, but I’m going to stick to base 10 here.

After watching the video, I started playing around with Harshad numbers in Mathematica. Because Mathematica has both a DigitSum function and a Divisible function, it’s fairly easy to define a Boolean function that determines whether a number is Harshad or not:

myHarshadQ[n_] := Divisible[n, DigitSum[n]]

Mathematica functions that return Boolean values often end with Q (Divisible being an obvious exception), so that’s why I put a Q at the end of myHarshadQ. A couple of things to note:

To get all the Harshad numbers up through 100, we can use the Select and Range functions like this:

Select[Range[100], myHarshadQ]

which returns the list

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12,
 18, 20, 21, 24, 27, 30, 36, 40, 42, 45, 48,
 50, 54, 60, 63, 70, 72, 80, 81, 84, 90, 100}

You can check this against Sequence A005349 in the On-Line Encyclopedia of Integer Sequences.

There are a couple of ways to count how many Harshad numbers are in a given range. Either

Length[Select[Range[100], myHarshadQ]]

which is a straightforward extension of what we did above, or the less obvious

Count[myHarshadQ[Range[100]], True]

which works by making a list of 100 Boolean values and counting how many are True. The Count function is like Length, but it returns only the number of elements in a list that match a pattern. Both methods return the correct answer of 33, and they take about the same amount of time (as determined through the Timing function) to run.

While Mathematica doesn’t have a built-in function for determining whether a number is Harshad, it does have an external function, HarshadNumberQ, that does. This can be used in conjunction with ResourceFunction. The call

ResourceFunction["HarshadNumberQ"][42]

returns True. Similarly,

Select[Range[100], ResourceFunction["HarshadNumberQ"]]

returns the list of 33 numbers given above.

The resource function is considerably faster than mine. Running

Timing[Count[ResourceFunction["HarshadNumberQ"][Range[10000]], True]]

returns {0.027855, 1538}, while running

Timing[Count[myHarshadQ[Range[10000]], True]]

returns {0.06739, 1538}. The runtime, which is the first item in the list, changes slightly from one trial to the next, but myHarshadQ always takes nearly three times as long.

To figure out why, I downloaded the source notebook for HarshadNumberQ and took a look. Here’s the source code for the function:

SetAttributes[HarshadNumberQ, Listable];
iHarshadNumberQ[n_, b_] := Divisible[n, Total[IntegerDigits[n, Abs[b]]]]
HarshadNumberQ[n_Integer, b_Integer] := 
  With[{res = iHarshadNumberQ[Abs[n], b]}, res /; BooleanQ[res]]
HarshadNumberQ[n_Integer] := HarshadNumberQ[n, 10];
HarshadNumberQ[_, Repeated[_, {0, 1}]] := False

I won’t pretend to understand everything that’s going on here, but I have figured out a few things:

First, HarshadNumberQ is obviously defined to handle any number base, not just base-10. And it can handle a list of bases, too, so if you want to check on how many bases in which a number is Harshad, this is the function you want.

Second, HarshadNumberQ uses an argument pattern I’ve never seen before: n_Integer. This restricts it to accepting only integer inputs. It returns False when given noninteger arguments; it doesn’t just throw an error the way myHarshadQ does.

Third, HarshadNumberQ uses Total[IntegerDigits[n]] instead of DigitSum[n]. This, I believe, is at least part of the reason it runs faster than myHarshadQ. For example,

Timing[Total[IntegerDigits[99999^10]]]

returns {0.000084, 216}, while

Timing[DigitSum[99999^10]]

returns {0.000146, 216}.

Finally, the SetAttributes function is needed because Total would otherwise have trouble handling the output of IntegerDigits when the first argument is a list. Let’s look at some examples:

IntegerDigits[123456]

returns {1, 2, 3, 4, 5, 6}, a simple list that Total knows how to sum to 21. But

IntegerDigits[{12, 345, 6789}]

returns a lists of lists, {{1, 2}, {3, 4, 5}, {6, 7, 8, 9}}, which Total has trouble with. Calling

Total[IntegerDigits[{12, 345, 6789}]]

returns an error, saying that lists of unequal length cannot be added. We can get around this error by giving Total a second argument. Calling it as

Total[IntegerDigits[{12, 345, 6789}], {2}]

tells Total to add the second-level lists, returning {3, 12, 30}, which is just what we want. The problem is that this second argument to leads to an error when the first argument is a scalar instead of a list.

This, as best I can tell, is where the

SetAttributes[HarshadNumberQ, Listable];

line comes in. By telling Mathematica that HarshadNumberQ is Listable, we can define the function as if Total were always being used on scalars, and the system will thread over lists just like we want.

Since I’m just writing myHarshadQ for my own entertainment and not for others to use, I can take some of what I learned above and rewrite it to run faster. Like this:

SetAttributes[myHarshadQ, Listable]; 
myHarshadQ[n_] := Divisible[n, Total[IntegerDigits[n]]]

With this new definition, myHarshadQ runs considerably faster. Calling

Timing[Count[myHarshadQ[Range[10000]], True]]

returns {0.013527, 1538}. This is roughly twice as fast as HarshadNumberQ, probably because the definition doesn’t deal with other bases or handle errors gracefully.

Now I can look at Harshad numbers over a broad range without having to wait long for results. To see how they’re distributed over the first million numbers, I ran

Histogram[Select[Range[999999], HarshadNumberQ], {25000}, ImageSize -> Large]

and got this histogram (with a bin width of 25,000):

Harshad distribution histogram

The stepdown pattern repeats every 100,000 numbers, moving down each time. Let’s stretch out the range to three million and see what happens.

Histogram[Select[Range[2999999], HarshadNumberQ], {25000}, ImageSize -> Large]

Harshad distribution histogram over 3 million

The pattern repeats at the larger scale, too.

If you want to learn some other Harshad facts, you should watch this extra footage at Numberphile. There doesn’t seem to be anything directly practical you can do with Harshad numbers, but you can use them to exercise your mathematical muscles. Or learn new things about the Wolfram Language.


  1. You might think Harshad numbers are named after their discoverer or, in accordance with Stigler’s Law, their popularizer, but no. See either the video or the Wikipedia article for the origin of the name.