New SciPy Superpack
November 28th, 2012 at 9:17 pm by Dr. Drang
Before I go any further, I should stress that the “new” in the title of the post means “new for me,” not “new to the rest of the world.” Because I’m still running Lion, I need to use a version of the Superpack that’s a little behind the times. But it’s more up-to-date than the one I installed back in June.
The Superpack I’m talking about is Chris Fonnesbeck’s SciPy Superpack, a set of Python modules and a shell script that ties everything together and does the installation. The modules you get in the Superpack are:
- IPython (version 0.14.dev) This is the interactive Python shell that offers more than simply running python from the command line. In the past I’ve thought it not worth the effort, but after reading more about it in Wes McKinney’s Python for Data Analysis, I’ve decided to give it a whirl.
- NumPy (version 1.8.0.dev_436a28f_20120710) This is the base library upon which all other numerical and scientific Python modules are built. It provides classes and methods for fast matrix computations.
- SciPy (version 0.10.1) This is a set of modules for performing common scientific and engineering computation: numerical integration, optimization, linear algebra, Fourier transforms, statistics, and so on.
- matplotlib (version 1.2.x) The plotting library I’ve been using in place of Gnuplot.
- pandas (version 0.8.1.dev_8cc9826_20120717) A data analysis package for reading, writing, and manipulating data files. Given how much time I’ve spent in the past manipulating files to get them in shape for analysis, pandas seems like a godsend. But I haven’t had a good project to use it on yet.
- pymc (version 2.2) A module for Markov Chain Monte Carlo sampling. I haven’t done anything with Markov chains in ages, so I don’t see myself using this one.
- scikit_learn (version 0.12_git) A module for machine learning, something I’ve never dabbled in and don’t ever expect to.
- Statsmodels (version 0.5.0) A module for (surprise!) the statistical modeling of data. I can see myself using this, but I haven’t so far.
It also installs gFortran because some parts of these modules need to be compiled.
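If you want to confirm which versions actually ended up on your machine, a short script along these lines should do it. This is only a sketch: the import names in MODULES are my assumption (scikit-learn, for instance, is imported as sklearn), and installed_versions is a helper name made up for the illustration, not part of the Superpack.

```python
import importlib

# Import names for the Superpack modules listed above. These are assumed
# import names; they differ from the display names in a few cases.
MODULES = ["IPython", "numpy", "scipy", "matplotlib",
           "pandas", "pymc", "sklearn", "statsmodels"]


def installed_versions(names):
    """Map each module name to its version string, or None if it won't import."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # A missing module raises ImportError; a broken binary install
            # can raise other errors, so treat any failure as "not usable".
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to a label.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions(MODULES).items()):
        print("%-12s %s" % (name, version or "not installed"))
```

Anything reported as "not installed" either didn't get installed or failed to import.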
When I first installed the Superpack in June, everything worked fine, but there was one small annoyance. Every time I imported something from the scipy.stats library, I’d get this warning:
RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility
I soon learned that this was not a warning I needed to worry about, but I still didn’t like seeing it. My hope was that more recent versions of these packages would suppress the warning. Luckily, that turned out to be true, and now I have the modules I want with no extraneous warnings when my scripts run.
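For anyone stuck on the older versions, the standard library's warnings module can hide this particular message. What follows is a minimal sketch, not something I needed after the upgrade; silence_ndarray_size_warning is a name invented for the example, and the filter has to be installed before the import that triggers the warning.

```python
import warnings


def silence_ndarray_size_warning():
    """Ignore only the 'numpy.ndarray size changed' RuntimeWarning."""
    # The message argument is a regular expression matched against the
    # start of the warning text quoted above.
    warnings.filterwarnings(
        "ignore",
        message="numpy.ndarray size changed",
        category=RuntimeWarning,
    )


silence_ndarray_size_warning()
# Import after the filter is in place, because the warning fires at import time:
# from scipy import stats
```

Other RuntimeWarnings still get through, which is the point of filtering on the message rather than on the whole category.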
The newer versions of the modules are installed as .egg files alongside the older versions in /Library/Python/2.7/site-packages. I assume I can just delete the older versions with no ill effect, but I haven’t bothered and probably won’t. Based on my experience in upgrading from Snow Leopard to Lion, I fully expect all of my site-packages libraries to get wiped out. There’s no sense in doing any maintenance on a directory that’s likely to need a complete rebuild in a month or so.
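If you did want to see which packages have more than one copy sitting in that directory, something like the following would list them. It's a rough sketch that assumes egg names start with the package name followed by a hyphen; find_duplicates is a hypothetical helper, and it only reports, it doesn't delete anything.

```python
import os
from collections import defaultdict


def find_duplicates(site_packages):
    """Group directory entries by package name; return names with 2+ entries."""
    groups = defaultdict(list)
    for entry in os.listdir(site_packages):
        if entry.endswith((".egg", ".egg-info")):
            # Assumed egg naming: name-version-pyX.Y[-platform].egg
            base = entry.split("-")[0]
        else:
            # Plain package directories and single-file modules.
            base = os.path.splitext(entry)[0]
        groups[base.lower()].append(entry)
    return {name: sorted(entries)
            for name, entries in groups.items() if len(entries) > 1}


# Example call, using the directory mentioned above:
# find_duplicates("/Library/Python/2.7/site-packages")
```

It won't catch a package whose egg name differs from its import name, so treat the output as a starting point rather than a deletion list.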
If you’re running Mountain Lion (and I suspect most of you are), you should get the fully up-to-date Superpack for 10.8. It has more recent versions of almost all of the packages.




November 29th, 2012 at 7:29 am
I’ve been trying out a different scipy/numpy distribution: Enthought. It includes the same core packages (numpy, scipy, ipython and matplotlib) but differs from the Superpack in its additional packages.
But I’ve been trying to use virtualenv for my various Python environments, and none of these packaged distributions play nicely with virtualenv. I was curious whether you’ve tried using virtualenv and, if so, whether you’ve tried to get the Superpack to install in a virtual environment.
November 29th, 2012 at 10:38 am
David: I’ve not used it (yet), but the install script for Fonnesbeck’s SciPy Superpack does seem designed to specifically recognize and adapt to the situation where it’s run within a virtualenv.
The catch is you have to make sure that the virtualenv is activated first (i.e., run the activate script for it); it looks like having your virtualenv as the default Python distribution isn’t quite enough.
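To illustrate the difference between "activated" and "merely the default Python," here is a small Python sketch. It is not taken from the Superpack's install script, which I haven't traced line by line; virtualenv_status is a made-up helper. It relies on two things I'm confident of: the activate script sets the VIRTUAL_ENV environment variable, and an interpreter living inside a virtual environment exposes either sys.real_prefix or a sys.base_prefix that differs from sys.prefix.

```python
import os
import sys


def virtualenv_status():
    """Return (activated, interpreter_in_env) as two booleans."""
    # Set by the activate script; absent if the env's python is merely
    # first on the PATH or called by its full path.
    activated = "VIRTUAL_ENV" in os.environ
    # Older virtualenv sets sys.real_prefix; venv and newer virtualenv
    # make sys.base_prefix differ from sys.prefix instead.
    in_env = hasattr(sys, "real_prefix") or (
        getattr(sys, "base_prefix", sys.prefix) != sys.prefix
    )
    return activated, in_env


if __name__ == "__main__":
    activated, in_env = virtualenv_status()
    print("activate script run: %s" % activated)
    print("interpreter in env:  %s" % in_env)
```

If the second line says True but the first says False, you're in the situation described above: the virtualenv's Python is running, but a shell script looking for an activated environment wouldn't see one.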
On an unrelated note: Dr Drang: Now that the build-to-order prices have been (most likely) revealed, what specs do you have your eye on for the new iMac?
November 29th, 2012 at 12:27 pm
David,
My impression is the same as Wes’s—that the Superpack is built to work in a virtualenv. But, like Wes, I don’t use virtualenv and don’t have direct experience.
Wes,
As I said somewhere (Twitter?), the various roll-your-own Fusion drive articles have made me much more comfortable with it. I suspect I’ll be getting a 3 TB Fusion and will do some sort of third-party RAM upgrade.