Navigating Mathematica with Keyboard Maestro
October 3, 2025 at 4:23 PM by Dr. Drang
If you think I’ve written too much about Mathematica recently, don’t worry. This post discusses something in Mathematica that annoys me, but it isn’t really about Mathematica. It’s about how I eliminated that annoyance with Keyboard Maestro and how I should have done so long ago.
What I hate about Mathematica is navigating in its help window. Like Safari, the Mathematica help window has backward and forward navigation arrows up near the top left,
but unlike Safari, it doesn’t use the standard keyboard shortcuts of ⌘[ and ⌘] to navigate. Wolfram’s documentation says you can use “Alt ←” and “Alt →” to navigate, but those don’t work.1 If there’s a cursor blinking somewhere in the window, ⌥← and ⌥→ move the cursor left and right, which is standard Mac cursor control. If there’s no active cursor in the help window, ⌥← and ⌥→ do nothing.
And even if Mathematica worked the way its documentation says it works, and it overrode the standard behavior of ⌥← and ⌥→, I would never use them for navigation because my fingers don’t think that way. They want to use ⌘[ and ⌘] because that’s what Safari has trained them to use (also Firefox and Chrome).
Recently, during a particularly intense session with Mathematica’s help system, I got angry with myself. Why did I keep hitting ⌘[ and ⌘] and getting angry with Mathematica? Yes, Wolfram should adhere to Mac conventions, but you’re not powerless. Stop getting angry and fix the problem!
So I opened Keyboard Maestro and had a pair of one-step macros written in a few minutes. Here’s ,
and here’s
,They’re saved in Keyboard Maestro’s 2
group, which means they work only when Mathematica is the active application.Initially, I tried to use Keyboard Maestro’s
action, but that didn’t work consistently. Apparently, Mathematica’s arrow buttons are too anemic for Keyboard Maestro to find reliably. So I switched to and all was right with the world.I should mention that the coordinates of the click with respect to the upper left corner of the window were determined with the help of the
button. Click it and you have a five-second countdown (with beeps) to bring the appropriate window to the front and position the mouse where you want the click to happen. The coordinates fill in automatically. It’s one of those features that users love about Keyboard Maestro.Now when my fingers press ⌘[ and ⌘], Mathematica’s help window responds the way it should, and I don’t get angry anymore. Except a little bit at myself for not doing this ages ago.
-
Normally, this is where I’d complain that Macs don’t have Alt keys, they have Option keys, but I’m going to let that pass. See how I’m letting that pass? ↩
-
Wolfram has really messed up its naming. The documentation calls the app either “Mathematica” or “Wolfram Mathematica,” depending on what part of the documentation you’re looking at. My account says I have a license for “Wolfram Mathematica.” The app’s icon and the menu bar when the app is running say it’s “Wolfram.” Keyboard Maestro goes with the name in the menu bar. ↩
Dot plots in Mathematica
October 2, 2025 at 5:52 PM by Dr. Drang
I ran across Mathematica’s NumberLinePlot
function recently and wondered if I could use it to make dot plots. The answer was yes, but not directly.
By “dot plots,” I mean the things that look sort of like histogram but with stacked dots instead of boxes to represent each count. Here’s an example from a nice paper on dot plots by Leland Wilkinson (the Grammar of Graphics guy):
The paper is referenced in the Wikipedia article on dot plots, and rightly so. Here’s another in a slightly different style from Statistics for Experimenters by Box, Hunter, and Hunter (who call them dot diagrams):
(The vertical line and arrow are add-ons to the dot plot pertinent to the particular problem BH&H were discussing.)
If you’re a William Cleveland fan, you should be aware that the kind of dot plots I’ll be talking about here are not what he calls dot plots. His dot plots are basically bar charts with dots where the ends of the bars would be. Like this example from The Elements of Graphing Data:
With that introduction out of the way, let’s talk about Wilkinson-style dot plots and NumberLinePlot
.
In a nutshell, NumberLinePlot
takes a list of numbers and produces a graph in which the numbers are plotted as dots along a horizontal axis—very much like the number lines I learned about in elementary school. I wondered what it would do if some of the numbers were repeated, so I defined this list of 30 integers with a bunch of repeats,
a = {7, 2, 4, 8, 5, 7, 4, 2, 1, 3, 7, 5, 7, 8, 7,
2, 5, 6, 7, 6, 1, 5, 2, 1, 6, 5, 8, 3, 8, 3}
and ran NumberLinePlot[a]
. Here’s the output:
Unfortunately, it plots the repeated numbers on top of one another so you can’t see how many repeats there are. I looked through the documentation to see if I could pass an option to NumberLinePlot
to get it to stack the repeated dots vertically, but no go.
There is, however, a way to get NumberLinePlot
to stack dots. You have to pass it a nested list set up with sublists having no repeated elements. For example, if we rewrite our original flat list, a
, this way,
b = {{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7, 8},
{1, 2, 3, 5, 6, 7, 8}, {2, 5, 7, 8}, {5, 7}, {7}}
and run NumberLinePlot[b]
, we get this graph:
As you can see, each sublist is plotted as its own row. That’s why each row has a different color: Mathematica sees this as a combination of six number line plots and gives each its own color by default. But don’t worry about the colors or the size and spacing of the dots; those can all be changed by passing options to NumberLinePlot
. What you should worry about is how to turn flat list a
into nested list b
. Doing it by hand is both tedious and prone to error.
As it happens, someone has already worked that out and combined it with NumberLinePlot
to make an external function, DotPlot
, that makes the kind of graph we want. Running ResourceFunction["DotPlot"][a]
produces
Again, we can add options to DotPlot
to give it the style Wilkinson suggests. Calling it this way,
ResourceFunction["DotPlot"][a, Ticks -> {Range[8]},
AspectRatio -> .2, PlotRange -> {.5, 8.5},
PlotStyle -> Directive[Black, PointSize[.028]]]
gives us
After setting the aspect ratio to 1:5, I had to fiddle around with the PointSize
to get the dots to sit on top of one another. It didn’t take long.
This is great, but I wanted to know what DotPlot
was doing behind the scenes. So I downloaded its source notebook and poked around. The code wasn’t especially complicated, but there was one very clever piece. Let’s go through the steps.
First, we Sort
the original flat list and Split
it into a list of lists in which each sublist consists of a single repeated value.
nest = Split[Sort[a]]
yields
{{1, 1, 1}, {2, 2, 2, 2}, {3, 3, 3}, {4, 4}, {5, 5, 5, 5, 5},
{6, 6, 6}, {7, 7, 7, 7, 7, 7}, {8, 8, 8, 8}}
Compare this with the nested list b
we used above and you can see that they’re sort of opposites. If we think of these nested lists as matrices, the rows of b
are the columns of nest
, and vice versa. That suggests a Transpose
of nest
would get us to b
, but the problem is that neither nest
nor b
are full matrices. They both have rows of different lengths.
The solution to this is a three-step process:
- Pad the rows of
nest
with dummy values so they’re all the same length. - Transpose that padded list of lists.
- Delete the dummy values from the transpose.
The code for the first step is
maxlen = Max[Map[Length, nest]];
padInnerLists[l_] := PadRight[l, maxlen, "x"]
padded = Map[padInnerLists, nest]
This sets maxlen
to 6 and pads all of nest
’s rows out to that length by adding x
strings. The result is this value for padded
:
{{1, 1, 1, "x", "x", "x"}, {2, 2, 2, 2, "x", "x"},
{3, 3, 3, "x", "x", "x"}, {4, 4, "x", "x", "x", "x"},
{5, 5, 5, 5, 5, "x"}, {6, 6, 6, "x", "x", "x"},
{7, 7, 7, 7, 7, 7}, {8, 8, 8, 8, "x", "x"}}
I should mention here that Map
works the same way mapping functions work in other languages: it applies function to each element of a list and returns the list of results.
Now we can transpose it with
paddedT = Transpose[padded]
which yields
{{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7, 8},
{1, 2, 3, "x", 5, 6, 7, 8}, {"x", 2, "x", "x", 5, "x", 7, 8},
{"x", "x", "x", "x", 5, "x", 7, "x"},
{"x", "x", "x", "x", "x", "x", 7, "x"}}
The last step is to use DeleteCases
to get rid of the dummy values.
nestT = DeleteCases[paddedT, "x", All]
yields
{{1, 2, 3, 4, 5, 6, 7, 8}, {1, 2, 3, 4, 5, 6, 7, 8},
{1, 2, 3, 5, 6, 7, 8}, {2, 5, 7, 8}, {5, 7}, {7}}
which is exactly what we want. This is what DotPlot
does to prepare the original flat list for passing to NumberLinePlot
.1
Now that I know how DotPlot
works, I feel comfortable using it.
You may be wondering why I don’t just run
Histogram[a, {1}, Ticks -> {Range[8]}]
to get a histogram that presents basically the same information.
First, where’s the fun in that? Second, dot plots appeal to my sense of history. They have a kind of hand-drawn look that reminds me of using tally marks to count items in the days before we all used computers.
-
DotPlot
uses anonymous functions and more compact notation than I’ve used here. But the effect is the same, and I wanted to avoid weird symbols like#
,&
, and/@
. ↩
Baseball durations after the pitch clock
September 30, 2025 at 8:20 PM by Dr. Drang
A couple of years ago, after Major League Baseball announced they’d start using a pitch clock, I said I would write a post about the change in the duration of regular season games. I didn’t. My recent post about Apple’s Sports app and its misunderstanding of baseball standings reminded me of my broken promise. I decided to write a new game duration post after the 2025 season ended.
Unfortunately, Retrosheet, which is where I get my data, hasn’t published its 2025 game logs yet. I decided to go ahead with the post anyway, mainly because putting it off would probably lead to my forgetting again. So here’s the graph of regular season durations through the 2024 season. When the 2025 data comes in, I’ll update it (I hope).
The black line is the median duration, and the light blue zone behind it is the interquartile range. The middle 50% of games lie within that range.
Clearly, there was a huge dropoff in 2023, so I’d say the pitch clock was a big success. The further reduction in 2024 was within the historical year-to-year duration change—I wouldn’t attribute any significance to it.
Games are now roughly as long as they were in the early ’80s, so the powers at MLB have cut about four decades of fat from the game. And they did it without reducing the number of commercials, because they’d never do that.
I made the graph more or less the same way I made the last one, although I combined some of the steps into a single script. First, I downloaded and unzipped all the yearly game logs since 1920 from Retrosheet. They have names like gl1920.txt
. Then I converted the line endings from DOS-style to Unix-style via
dos2unix gl*.txt
I got the dos2unix
command from Homebrew.
I concatenated all the data into one big (189 MB) file using
cat gl*.txt > gl.csv
Although the files from Retrosheet have a .txt
extension, they’re actually CSV files (albeit without a header line). That’s why I gave the resulting file a .csv
extension.
I then ran the following script, which uses Python and Pandas to make a dataframe with just the columns I want from the big CSV file and calculate the quartile values for game durations on a year-by-year basis.
python:
1: #!/usr/bin/env python3
2:
3: import pandas as pd
4: from scipy.stats import scoreatpercentile
5: import sys
6:
7: # Make a simplified dataframe of all games with just the columns I want
8: cols = [0, 3, 4, 6, 7, 9, 10, 18]
9: colnames = 'Date VTeam VLeague HTeam HLeague VScore HScore Time'.split()
10: df = pd.read_csv('gl.csv', usecols=cols, names=colnames, parse_dates=[0])
11:
12: # Add a column for the year
13: df['Year'] = df.Date.dt.year
14:
15: # Use the dataframe created above to make a new dataframe
16: # with the game duration quartiles for each year
17: cols = 'Year Q1 Median Q3'.split()
18: dfq = pd.DataFrame(columns=cols)
19: for y in df.Year.unique():
20: p25 = scoreatpercentile(df.Time[df.Year==y], 25)
21: p50 = scoreatpercentile(df.Time[df.Year==y], 50)
22: p75 = scoreatpercentile(df.Time[df.Year==y], 75)
23: dfq.loc[y] = [y, p25, p50, p75]
24:
25: # Write a CSV file for the yearly duration quartiles
26: dfq.Year = dfq.Year.astype('int32')
27: dfq.to_csv('gametimes.csv', index=False)
This code is very similar to what I used a couple of years ago. You can read that post if you want an explanation of any of the details.
Now I had a new CSV file, called gametimes.csv
, that looked like this
Year,Q1,Median,Q3
1920,99.0,109.5,120.0
1921,100.0,111.0,125.0
1922,100.0,110.5,124.0
1923,102.0,112.0,125.0
1924,101.0,112.0,125.0
1925,102.0,114.0,127.0
[and so on]
The graph was made by running this script, which reads the data from gametimes.csv
and uses Matplotlib to output the PNG image shown above:
python:
1: #!/usr/bin/env python3
2:
3: import pandas as pd
4: import matplotlib.pyplot as plt
5: from datetime import date
6:
7: # Import game time data
8: df = pd.read_csv('gametimes.csv')
9:
10: # Create the plot with a given size in inches
11: fig, ax = plt.subplots(figsize=(6, 4))
12:
13: # Add the interquartile range and the median
14: plt.fill_between(df.Year, df.Q1, df.Q3, alpha=.25, linewidth=0, color='#0066ff')
15: ax.plot(df.Year, df.Median, '-', color='black', lw=2)
16:
17: # Gridlines and ticks
18: ax.grid(linewidth=.5, axis='x', which='major', color='#bbbbbb', linestyle='-')
19: ax.grid(linewidth=.5, axis='y', which='major', color='#bbbbbb', linestyle='-')
20: ax.tick_params(which='both', width=.5)
21:
22: # Title and axis labels
23: plt.title('Baseball game durations')
24: plt.xlabel('Year')
25: plt.ylabel('Minutes per game')
26:
27: # Save as PNG
28: day = date.today()
29: dstring = f'{day.year:4d}{day.month:02d}{day.day:02d}'
30: plt.savefig(f'{dstring}-Baseball game durations.png', format='png', dpi=200)
I don’t know when the folks at Retrosheet will add the 2025 data. Maybe after the World Series. I’ll check back then and update the post if there’s a new year of game logs.
Sparking joy
September 28, 2025 at 3:55 PM by Dr. Drang
I watched this Numberphile video a few days ago and learned not only about Harshad numbers, which I’d never heard of before, but also some new things about defining functions in Mathematica.
Harshad numbers1 are defined as integers that are divisible by the sum of their digits. For example,
so 42 is a Harshad number. On the other hand,
so 76 is not. You can use any base to determine whether a number is Harshad or not, but I’m going to stick to base 10 here.
After watching the video, I started playing around with Harshad numbers in Mathematica. Because Mathematica has both a DigitSum
function and a Divisible
function, it’s fairly easy to define a Boolean function that determines whether a number is Harshad or not:
myHarshadQ[n_] := Divisible[n, DigitSum[n]]
Mathematica functions that return Boolean values often end with Q (Divisible
being an obvious exception), so that’s why I put a Q at the end of myHarshadQ
. A couple of things to note:
- Both
Divisible
andDigitSum
accept and return lists. Therefore, we can givemyHarshadQ
a list of numbers and it will return a list of Boolean values. - Because
DigitSum
accepts only integer arguments,myHarshadQ
throws an error if you feed it noninteger arguments.
To get all the Harshad numbers up through 100, we can use the Select
and Range
functions like this:
Select[Range[100], myHarshadQ]
which returns the list
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12,
18, 20, 21, 24, 27, 30, 36, 40, 42, 45, 48,
50, 54, 60, 63, 70, 72, 80, 81, 84, 90, 100}
You can check this against Sequence A005349 in the On-Line Encyclopedia of Integer Sequences.
There are a couple of ways to count how many Harshad numbers are in a given range. Either
Length[Select[Range[100], myHarshadQ]]
which is a straightforward extension of what we did above, or the less obvious
Count[myHarshadQ[Range[100]], True]
which works by making a list of 100 Boolean values and counting how many are True
. The Count
function is like Length
, but it returns only the number of elements in a list that match a pattern. Both methods return the correct answer of 33, and they take about the same amount of time (as determined through the Timing
function) to run.
While Mathematica doesn’t have a built-in function for determining whether a number is Harshad, it does have an external function, HarshadNumberQ
, that does. This can be used in conjunction with ResourceFunction
. The call
ResourceFunction["HarshadNumberQ"][42]
returns True
. Similarly,
Select[Range[100], ResourceFunction["HarshadNumberQ"]]
returns the list of 33 numbers given above.
The resource function is considerably faster than mine. Running
Timing[Count[ResourceFunction["HarshadNumberQ"][Range[10000]], True]]
returns {0.027855, 1538}
, while running
Timing[Count[myHarshadQ[Range[10000]], True]]
returns {0.06739, 1538}
. The runtime, which is the first item in the list, changes slightly from one trial to the next, but myHarshadQ
always takes nearly three times as long.
To figure out why, I downloaded the source notebook for HarshadNumberQ
and took a look. Here’s the source code for the function:
SetAttributes[HarshadNumberQ, Listable];
iHarshadNumberQ[n_, b_] := Divisible[n, Total[IntegerDigits[n, Abs[b]]]]
HarshadNumberQ[n_Integer, b_Integer] :=
With[{res = iHarshadNumberQ[Abs[n], b]}, res /; BooleanQ[res]]
HarshadNumberQ[n_Integer] := HarshadNumberQ[n, 10];
HarshadNumberQ[_, Repeated[_, {0, 1}]] := False
I won’t pretend to understand everything that’s going on here, but I have figured out a few things:
First, HarshadNumberQ
is obviously defined to handle any number base, not just base-10. And it can handle a list of bases, too, so if you want to check on how many bases in which a number is Harshad, this is the function you want.
Second, HarshadNumberQ
uses an argument pattern I’ve never seen before: n_Integer
. This restricts it to accepting only integer inputs. It returns False
when given noninteger arguments; it doesn’t just throw an error the way myHarshadQ
does.
Third, HarshadNumberQ
uses Total[IntegerDigits[n]]
instead of DigitSum[n]
. This, I believe, is at least part of the reason it runs faster than myHarshadQ
. For example,
Timing[Total[IntegerDigits[99999^10]]]
returns {0.000084, 216}
, while
Timing[DigitSum[99999^10]]
returns {0.000146, 216}
.
Finally, the SetAttributes
function is needed because Total
would otherwise have trouble handling the output of IntegerDigits
when the first argument is a list. Let’s look at some examples:
IntegerDigits[123456]
returns {1, 2, 3, 4, 5, 6}
, a simple list that Total
knows how to sum to 21. But
IntegerDigits[{12, 345, 6789}]
returns a lists of lists, {{1, 2}, {3, 4, 5}, {6, 7, 8, 9}}
, which Total
has trouble with. Calling
Total[IntegerDigits[{12, 345, 6789}]]
returns an error, saying that lists of unequal length cannot be added. We can get around this error by giving Total
a second argument. Calling it as
Total[IntegerDigits[{12, 345, 6789}], {2}]
tells Total
to add the second-level lists, returning {3, 12, 30}
, which is just what we want. The problem is that this second argument to leads to an error when the first argument is a scalar instead of a list.
This, as best I can tell, is where the
SetAttributes[HarshadNumberQ, Listable];
line comes in. By telling Mathematica that HarshadNumberQ
is Listable
, we can define the function as if Total
were always being used on scalars, and the system will thread over lists just like we want.
Since I’m just writing myHarshadQ
for my own entertainment and not for others to use, I can take some of what I learned above and rewrite it to run faster. Like this:
SetAttributes[myHarshadQ, Listable];
myHarshadQ[n_] := Divisible[n, Total[IntegerDigits[n]]]
With this new definition, myHarshadQ
runs considerably faster. Calling
Timing[Count[myHarshadQ[Range[10000]], True]]
returns {0.013527, 1538}
. This is roughly twice as fast as HarshadNumberQ
, probably because the definition doesn’t deal with other bases or handle errors gracefully.
Now I can look at Harshad numbers over a broad range without having to wait long for results. To see how they’re distributed over the first million numbers, I ran
Histogram[Select[Range[999999], HarshadNumberQ], {25000}, ImageSize -> Large]
and got this histogram (with a bin width of 25,000):
The stepdown pattern repeats every 100,000 numbers, moving down each time. Let’s stretch out the range to three million and see what happens.
Histogram[Select[Range[2999999], HarshadNumberQ], {25000}, ImageSize -> Large]
The pattern repeats at the larger scale, too.
If you want to learn some other Harshad facts, you should watch this extra footage at Numberphile. There doesn’t seem to be anything directly practical you can do with Harshad numbers, but you can use them to exercise your mathematical muscles. Or learn new things about the Wolfram Language.
-
You might think Harshad numbers are named after their discoverer or, in accordance with Stigler’s Law, their popularizer, but no. See either the video or the Wikipedia article for the origin of the name. ↩