SciPy and image analysis

“Image analysis” is a little too hifalutin for what I did today, but it was fun and I solved a real problem.

I had a scanned drawing of the cross section of a hollow extruded aluminum part and needed to calculate the enclosed volume. Because the part’s exterior and interior surfaces were curved—and not arcs of circles or ellipses—straightforward area calculations weren’t possible. But I figured I could make a good estimate by counting pixels and scaling.

The drawing looked sort of like this, only more complicated. There were internal partition walls and more dimension lines.

Dimensioned drawing

I opened the scan in Acorn, erased the dimension lines, and filled the solid parts with black and the hollows with 50% gray. Then I cropped it down to the smallest enclosing rectangle, the (physical) dimensions of which were given on the drawing. I ended up with something like this:

Cleaned and shaded

The image I had was dirtier than this because there were antialiasing artifacts from the scanning process, but you get the idea.

I had hopes that I could get the count of gray pixels directly from a histogram in Acorn, but I couldn’t find a command that would do that, so I shifted to Python.

The misc sublibrary of SciPy has an imread function that was just what I needed. It reads an image file (PNG, TIFF, JPEG) and turns it into a NumPy array of RGBA or gray values. With that array in hand, I could just scan through it, count the pixels that are at or near 50% gray, and calculate their percentage of the total. Here’s the script:

python:
 1:  #!/usr/bin/python
 2:  
 3:  from scipy import misc
 4:  import sys
 5:  
 6:  img = misc.imread(sys.argv[1], flatten=True)
 7:  white = gray = black = 0
 8:  lower = 255/3
 9:  upper = 2*lower
10:  height, width = img.shape
11:  
12:  for i in range(height):
13:    for j in range(width):
14:      if img[i,j] >= lower:
15:        if img[i,j] <= upper:
16:          gray += 1
17:        else:
18:          white += 1
19:      else:
20:        black += 1
21:  
22:  all = width*height
23:  print "Total pixels: %d" % all
24:  print "White pixels: %d (%5.2f%%)" % (white, 100.0*white/all)
25:  print "Black pixels: %d (%5.2f%%)" % (black, 100.0*black/all)
26:  print "Gray pixels:  %d (%5.2f%%)" % (gray, 100.0*gray/all)

I did a bit more than was needed, counting the white and black pixels as well as the gray.

Line 6 does the hard work—reading in the file, converting it to grayscale (with flatten=True), and putting it into an array. The tonal range of 255 was split in thirds in Lines 8 and 9 and every pixel within each third was lumped together. If I’d chosen different values for lower and upper, I would’ve gotten different results, but not too much different. The great majority of pixels had values of either 0, 128, or 255; only the antialiasing pixels at the edges of the lines were different.

The results looked like this:

Total pixels: 126003
White pixels: 63342 (50.27%)
Black pixels: 39870 (31.64%)
Gray pixels:  22791 (18.09%)

Multiplying the percentage of grays by the physical height and width of the enclosing rectangle gave me the cross-sectional area of the hollow. Multiplying that by the length of the extrusion gave me the volume. Two significant digits was all I really needed in the result, which is why I didn’t stress over the antialiasing pixels.

There are, I know, commercial programs that can do this and more. But most of them run on Windows (because most engineers use Windows), and the time I would’ve spent finding one and learning how to use it couldn’t have been too much less than the time it took to write 26 lines of code. And I know exactly how this code works.

Update 9/17/14
Alexandre Chabot rewrote my script to get rid of the loops in Lines 12–21 and replace them with NumPy’s sum function and a set of array-based Boolean expressions. For example,

python:
white = np.sum(img > upper)

returns the count of all the white pixels. The expression in the argument, img > upper compares each item in img to upper and returns an array of Trues and Falses. When that’s fed to sum, it returns the sum of all the Trues. Very nice.

Treating arrays in chunks like this is how NumPy is supposed to be used. I used loops because that’s what I’ve been doing for 35 years and old habits are hard to break.


PCalc construction set

One of the many great stories at Andy Herzfeld’s folklore.org site is about Chris Espinosa’s creation of the Mac’s original calculator desk accessory.

We all gathered around as Chris showed the calculator to Steve and then held his breath, waiting for Steve’s reaction. “Well, it’s a start”, Steve said, “but basically, it stinks. The background color is too dark, some lines are the wrong thickness, and the buttons are too big.” Chris told Steve he’ll keep changing it, until Steve thought he got it right.

So, for a couple of days, Chris would incorporate Steve’s suggestions from the previous day, but Steve would continue to find new faults each time he was shown it. Finally, Chris got a flash of inspiration.

The next afternoon, instead of a new iteration of the calculator, Chris unveiled his new approach, which he called “the Steve Jobs Roll Your Own Calculator Construction Set”. Every decision regarding graphical attributes of the calculator were parameterized by pull-down menus. You could select line thicknesses, button sizes, background patterns, etc.

Steve took a look at the new program, and immediately started fiddling with the parameters. After trying out alternatives for ten minutes or so, he settled on something that he liked.

With version 3.3 of PCalc, James Thomson has gone Espinosa one better: he’s not only built a customizable PCalc, he’s given all of us the power of Steve Jobs.

(Oh, yeah. A lot of the new stuff in PCalc 3.3 has to do with iOS 8. I know nothing about these features because I don’t install beta operating systems on my workaday devices and I don’t have any spare iPhones lying around. It’s the customizable layouts I’ve been lusting after.)

Here’s the Engineering layout I’ve been using for ages, both the normal and 2nd configuration:

Engineering layout

It’s perfectly functional, but it isn’t exactly what I want. For example, I almost never need the log2 and 2x functions. With 3.3, I don’t have to see anything I don’t want.

Here’s the new Drang layout:

Drang layout

The first thing you should notice is that all the buttons (apart from the top row) are now the same size, so my fingers have bigger targets to hit. By getting rid of the log2 and 2x keys, the percentage keys, and the hyperbolic trig keys,1 I eliminated a whole row and was able to increase the height of what was left.

You might also notice that I have both the natural log and exponential functions on the normal layout. I use both of these functions a lot, and even though it would have been more consistent to have one as the 2nd key of the other, it was more practical to have them side-by-side.

A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines.
— Ralph Waldo Emerson

To make your own layout, it’s best to start with the built-in layout that most closely resembles what you want to end up with. Go into the Settings, open the list of layouts (vertical in this case), and tap the Edit button. You’ll get a chance to duplicate one of the built-ins and give it its own name.

Duplicating a layout

Then go back to the regular calculator view and start editing the buttons.

To edit a button, press and hold on it until the display shifts and handles appear at the corners of the button. You can use the handles to resize the button, and you can drag it around to any place you like.

Editing a button

To change what the button does, tap the Edit button along the bottom, and a screen will appear that’ll let you change the name and the behavior of the button. You can have it work like any of the regular commands, run a user function, perform a unit conversion, or insert a constant. You can have the button appear in the normal view, the 2nd view, or both.

Because I often do calculations involving the standard normal distribution, I added buttons that calculate its cumulative distribution function (CDF) and inverse CDF. I’ve had these user-defined functions available since PCalc 2.8, but until now I’ve had to dig my way through the f(x) button to get at them.

The Drang layout isn’t particularly imaginative, but with a little thought and a little programming, you could turn PCalc into a whole series of special-purpose calculators. A financial calculator layout like the HP 12C would be a snap. You could also make a cooking layout with buttons for all the conversions between cups, ounces, teaspoons, tablespoons, etc. Metallurgists could create a layout that converts between the many hardness values.

You could, of course, use Numbers or Pythonista to perform these calculations and conversions. But there’s something very efficient about calculator model of tapping in a number and hitting a single button to get your answer. And with the new PCalc Construction Set, you can build a calculator that has buttons for everything you need and nothing you don’t.


  1. I asked James to add the hyperbolic trig keys a few years ago because they are necessary for some engineering and scientific calculations, but I don’t use them enough to have them on my everyday layout. When I need them, I can always switch to the Engineering layout. 


SGML nostalgia

When I switched to Linux in the late ’90s, I needed a way to write reports and correspondence for work. At the time, there weren’t any open source word processors worth mentioning, and I was done with wordprocessors, anyway. So I set up a report-writing workflow based on SGML, HTML’s big brother, and groff, the GNU version of the ancient Unix text formatter, troff.

SGML workflow

I actually enjoyed writing in SGML. Creating a DTD for my reports forced me to think hard about how they ought to be structured. Although my current workflow is different, and I write my reports in Markdown, I still structure them according to the rules I had to formalize back in 1997. And SGML isn’t the straightjacket that XML is; you don’t need closing tags—or even opening tags—if there’s no way to misinterpret an element.

I kind of went SGML-happy in the late ’90s, creating DTDs for every type of structured document I wrote, including my CV. The workflow for generating a PostScript version of my CV was basically the same as the one for reports. Here’s my CV DTD:

xml:
 1:  <!ELEMENT cv    - - (name, pos, intro?, s+)>
 2:  <!ELEMENT name  - O (#PCDATA)>
 3:  <!ELEMENT pos   - O (#PCDATA)>
 4:  <!ELEMENT intro - O (#PCDATA)>
 5:  <!ELEMENT s     - O (h, (item|ditem)*)>
 6:  <!ELEMENT h     O O (#PCDATA)>
 7:  <!ELEMENT item  - O (#PCDATA | cite | br)*>
 8:  <!ELEMENT ditem - O (#PCDATA | cite | br)*>
 9:  <!ATTLIST ditem  date  CDATA  #REQUIRED>
10:  <!ELEMENT br    - O EMPTY>
11:  <!ELEMENT cite  - - (#PCDATA)>

The structure isn’t too hard to work out. The CV as a whole consists of my name, my position with the company, an optional introductory paragraph, and then one or more sections. Each section consists of a header followed by some number of items or dated items. Dated items must have a date attribute; otherwise they’re identical to regular items. Items of either sort can contain citations and line breaks.

Here’s an example:

xml:
 1:  <!DOCTYPE cv SYSTEM "/Users/drang/dtd/cv.dtd">
 2:  <cv>
 3:  <name>
 4:  Dr. Drang, Ph.D., P.E.
 5:  <pos>
 6:  Engineering Mechanics
 7:  <s>
 8:  Employment
 9:  <ditem date="1991-present">
10:  Principal, Drang Engineering, Inc.
11:  <ditem date="1985-1990">
12:  Assistant Professor, Small Big Ten University
13:  <s>
14:  Education
15:  <ditem date=1985>
16:  Ph.D. in Civil Engineering; University of Illinois at Urbana-Champaign<br>
17:  Thesis: <cite>An Approach To Structural Analysis That No One Uses</cite>
18:  <ditem date=1982>
19:  M.S. in Civil Engineering; University of Illinois at Urbana-Champaign
20:  <ditem date=1981>
21:  B.S. in Civil Engineering; University of Illinois at Urbana-Champaign
22:  <s>
23:  Professional societies
24:  <item>
25:  American Society of Civil Engineers
26:  <item>
27:  American Institute of Steel Construction
28:  <item>
29:  American Concrete Institute
30:  <s>
31:  Professional licenses and registrations
32:  <item>
33:  Professional Engineer, State of Illinois
34:  <item>
35:  Professional Engineer, State of Indiana
36:  <item>
37:  Professional Engineer, State of Ohio
38:  </cv>

Note that the only closing tags are for the <cv> and <cite> elements. If you look in the DTD, you’ll see - 0 in most of the element definitions. That means the opening tag is required but the closing tag is optional. Both the opening and closing tags are optional for the <h> element; because it’s always the first element within an <s> and it’s always followed by either an <item> or a <ditem>, there’s no need for tags. The SGML processor will know that things like “Employment” and “Education” are <h> elements.

For several years I kept my CV in this form, updating it as necessary. Sometime after switching back to the Mac, I stopped maintaining the SGML version, updating only the troff version. Even though troff isn’t the easiest markup language to write in, adding an item to my CV was pretty simple. I’d just copy a chunk of formatting code from one item, paste it in, and then add the new text.

Yesterday, though, I needed to update a few items in the CV and had the bright idea to return to the SGML form. I still had an old SGML version, so it wasn’t too hard to add the stuff necessary to bring it up to date. But I soon realized I didn’t have an SGML processor—I’d never installed it on my iMac at work.

Back when I was using SGML regularly, the standard processor was nsgmls, part of James Clark’s SP suite of programs. I couldn’t find a precompiled version for OS X, so I decided to download the source and build it myself. Unfortunately, some of the commands in the makefile threw errors; something in either OS X’s compiler or its libraries wasn’t what the makefile expected. So I started a little yak-shaving adventure.

Installing gcc via Homebrew so I can compile an SGML processor so I can run a Perl program I wrote in 1996.

As you do.

Dr. Drang (@drdrang) Sep 12 2014 9:47 AM

Luckily, while gcc was compiling, continued Googling led me to a Homebrew recipe for OpenSP. I would never have guessed there was an OpenSP—Clark’s SP has always been open source. But after a

brew install open-sp

I was in business and was able to stop the installation of gcc and delete the dependencies had already been built. I generated my CV just as I had in the ’90s with only two differences:

Neither of these was a big deal. The pipeline looked like this:

onsgmls drangCV.sgml | cv2roff | groff | ps2pdf - > drangCV.pdf

The cv2roff part is a Perl script that converts the ESIS output of onsgmls into a troff document. I won’t be showing it here because it’s embarrassing. I had been programming Perl for less than a year when I wrote it, and it’s a mess. Worse, even, than my early Perl is the mixture of tabs and spaces in the source code. I’m sure I was using Emacs at the time and must not have known how to configure it yet. Ick.

Was it worth the trouble? I think so. Because of increased continuing education requirements to maintain my professional engineering licenses, and because I expect to be getting licensed in more states, I’ll be updating my CV more often. Having it in a concise SGML form will make it easier to edit. And even though my old Perl code is ugly, it’s fun to still be able to use a script I wrote over 15 years ago.


Terminal velocity

I recently read Ancillary Justice, the Hugo and Nebula award winning novel by Ann Leckie and enjoyed immensely, but it had one paragraph that brought me up short. The reason involves fluid mechanics, so it seemed like a good topic for a blog post.

But first the book. Ancillary Justice has a good story, but what hooked me was the way the story is told. The main character and narrator is an artificial intelligence that’s used primarily to run a military spaceship. The AI is shared among hundreds of reanimated corpses that act as the ship’s footsoldiers. The chapters alternate between two time periods about 20 years apart. Using multiple times is a common narrative device, but Leckie layers on other multiplicities.

First, because the narrator has hundreds of bodies, several of the scenes—fighting scenes in particular—are told from multiple points of view by the same narrator simultaneously. Second, as the characters move to different planets with different cultures and languages, they change genders. Their naughty bits and their personalities stay the same, but their perceived role in society flips. In scenes where two languages are being spoken, the gender of a character can change from one sentence to the next.

Leckie handles these shifts in time, position, and gender nicely, leaving you surprised but never confused. There’s also an interesting parallel between the main character and one of her former captains, a regular, non-corpse human who’s been awakened after a thousand years in suspended animation. Leckie’s light touch keeps their relationship interesting.

As for the paragraph that brought me up short, it comes during a scene in which the narrator and her former captain are falling together from a bridge. The narrator, an individual corpse soldier who’s cut off from the main AI, is trying to figure out how to slow the fall and survive.

If I had been more than just myself, if I had had the numbers I needed, I could have calculated our terminal velocity, and just how long it would take to reach it. Gravity was easy, but the drag of my pack and our heavy coats, whipping up around us, affecting our speed, was beyond me. It would have been much easier to calculate in a vacuum, but we weren’t falling in a vacuum.

Maybe this is just a clumsily written passage, but I don’t think so. Leckie isn’t a clumsy writer. As I read it, the “it” of the last sentence can only be either the terminal velocity or the time it takes to reach terminal velocity. But there is no terminal velocity in a vacuum. Objects that fall in a vacuum don’t stop accelerating until they hit the ground. The narrator of the story should know that, but apparently Leckie and her editors don’t.

Terminal velocity is the maximum speed an object achieves while falling through a fluid.1 When you drop something, gravity pulls it down while the viscosity of the fluid pushes back up. At first, the force of gravity is stronger than the resistance, and the object accelerates. But while gravity remains constant, the viscous resistive force increases with increasing velocity. Eventually, these two forces balance, and the object stops accelerating. It continues its fall at this constant, terminal velocity.

The relationship between velocity and viscous resistive force—called the drag force—is complicated and depends on the nature of the fluid. Generally speaking, in liquids the drag force is proportional to the velocity, while in gases the drag force is proportional to the square of the velocity. Walter Lewin does an excellent job of explaining the two regimes in this lecture from MIT’s Physics I course.

One way to express the drag force, [F_D], is through this equation:

[F_D = \frac{1}{2} C_D \rho A v^2]

where [\rho] is the density of the fluid, [A] is the cross-sectional area of the object, [v] is the velocity of the object, and [C_D] is the drag coefficient, which we’ll talk about more in a little while.

When a falling object reaches terminal velocity, the drag force equals the object’s weight. You probably recall from your physics class that weight can be expressed as [mg], where [m] is the mass of the object and [g] is the acceleration due to gravity.

Setting the two forces equal to one another,

[mg = \frac{1}{2} C_D \rho A v^2]

and solving for [v] gives

[v = \sqrt{\frac{2 mg}{C_D \rho A}}]

This is the expression for terminal velocity.

(Don’t be tempted to do some cancellation with [m] and [\rho] to get a volume. The [m] is the mass of the object and the [\rho] is the density of the fluid. They have no relationship.)

You might be wondering why we use an equation with just a [v^2] term when we said earlier that for some fluids the drag force is directly proportional to [v]. The answer lies in how we define the drag coefficient.

Note that we don’t call it the drag constant. That’s because there’s nothing constant about it. It depends on the shape of the falling object and, critically, the velocity of the falling object.

Nowadays, you can go to Wolfram Alpha to calculate the drag coefficient of, for example, a sphere, but when I was a student you’d get it off of graphs that looked like this:

Drag coefficient

From Engineering Fluid Mechanics by Roberson & Crowe

The horizontal axis is the Reynolds number, one of the many nondimensional numbers you run into in fluid mechanics. Supersonic flight has made the Mach number more well known, but the Reynolds number has more applications in engineering. Using our terminology, it’s defined as

[\mathrm{Re} = \frac{vd}{\nu}]

where [d] is some characteristic cross-sectional dimension (the diameter of sphere, for example) and [\nu] is the kinematic viscosity of the fluid. Low Reynolds numbers are associated with slow movement in viscous fluids, where the fluid flow around the object is laminar; high Reynolds numbers are associated with fast movement and turbulence.

In the upper left of the graph, you’ll see a straight line and the formula [C_D = 24/\mathrm{Re}]. This is the special case of laminar flow around a sphere, the situation covered by Stokes’ Law. If we substitute

[C_D = \frac{24 \nu}{v d}]

into our expression for [F_D] and express the cross-sectional area of the sphere in terms of its diameter, we get

[F_D = \frac{1}{2}\frac{24 \nu}{vd}\rho \left(\frac{\pi d^2}{4}\right) v^2 = 3\pi d \nu \rho v]

This is usually written

[F_D = 6 \pi \mu r v]

where we use the radius instead of the diameter and the dynamic viscosity, [\mu = \rho \nu]. Either way, you can see that for low Reynolds numbers, the definition of drag coefficient leads us to a result in which the drag force is directly proportional to velocity.

When I was a kid, you’d see examples of Stokes’ Law on TV all the time.

The pearl dropping through Prell was so iconic, commercials in the 70s could just show it without saying a word about it.

How can I be sure this commercial is from the 70s? Just look at Lauren Hutton’s hair at the end of it.

If Stokes’ Law holds, calculating the terminal velocity is pretty easy:

[mg = 6 \pi \mu r v] [v = \frac{mg}{6 \pi \mu r}]

At higher Reynolds numbers, where the flow is turbulent, the calculation isn’t so easy. You have to make a guess at the Reynolds number, get the drag coefficient from the graph, and plug that in to calculate the terminal velocity. Then you have to use that velocity to calculate the Reynolds number to see how good your initial guess was. Chances are, you’d have to adjust your guess and go through the process again.

If you’re on a relatively flat part of the graph, the drag coefficient doesn’t change much with the Reynolds number and your initial guess isn’t so critical. For example, a falling sphere can have a Reynolds number anywhere from 1,000 to 100,000 and the drag coefficient will pretty much stay in the range of 0.4 to 0.5.

Getting back to science fiction, there’s a famous example of terminal velocity in Arthur C. Clarke’s Rendezvous with Rama. One of the expedition party to Rama (a huge and mysterious cylindrical spaceship/world that spins to provide artificial gravity on its inside surface) is stuck on a 500-meter-high cliff and needs to be rescued. The solution is for him to simply jump off the cliff and fall into the ocean below. The artificial gravity is low and so, therefore, is his terminal velocity. He (spoiler alert!2) survives the impact with the water and is picked up by the rest of the crew.

Rendezvous with Rama, like Ancillary Justice, got both the Hugo and Nebula awards. Clarke may have understood physics better than Leckie, but his storytelling isn’t as inventive as hers.


  1. In colloquial English, fluid usually means liquid, but in mechanics a fluid can be either a liquid or a gas. 

  2. Speaking of spoiler alerts, The Incomparable podcast talked about Ancillary Justice in two episodes, The Partial Monty and Golem and Jinni Detective Agency