Bumping into OS X’s file limits
July 10th, 2006 at 9:27 pm
I’ve been working on a database problem for the past few days, and because of the size of the problem I’ve learned a bit about the limits of the OS X file system.
The database organizes several thousand scanned pages, stored as JPEGs and TIFFs. It ties the pages together into a few thousand documents of various lengths and includes the OCR’d text of the scanned pages. I’m using FileMaker because I need to share the database with others who are not always going to be hooked up to a network. FileMaker is cross-platform and combines a client and a server.
My first thought was to import the scans directly into the database. This made the file about 13 gigabytes and I found I couldn’t copy it from one disk to another. Not in the Finder, not with the Unix cp command, not with rsync. In each case, I got an error message that said, in effect, “This file is too big to copy.”
I compressed the file down to just under 7 gigabytes with gzip and tried to copy it again—same error message.
Lesson 1: OS X cannot copy files of several gigabytes from disk to disk.
Back to the database. I cleared out the scanned images and reimported them as references, which kept the database down to a reasonable size of 30 megabytes or so and worked fine. Before doing this, though, I tried to copy all the JPEGs and TIFFs into a simpler directory structure than the one in which I received them. And this is where I ran into the second limitation: when you try to copy thousands of files with a single drag in the Finder or a single cp command, you get an error message that tells you the file list is too long. (I didn’t try rsync; it wasn’t appropriate because I wasn’t trying to duplicate the existing directory structure.)
Lesson 2: OS X cannot copy several thousand files in one go.
I worked around this problem by issuing several cps, each working on a subset of the scanned files. These were put into several different folders. None of the folders has more than about a thousand files in it, so I won’t run into the limitation if I need to move them again.
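The same batching could be scripted instead of typed by hand. What follows is a minimal Python sketch under my own assumptions, not the commands I actually ran: the function name copy_in_batches, the batch_NNNN folder names, and the numeric filename prefix are all made up for illustration. It walks a source tree, picks out the JPEGs and TIFFs, and copies them into numbered folders of at most a thousand files each. Because each file is copied by its own call, no long file list is ever handed to a single command.

```python
import os
import shutil

# Extensions treated as scanned pages (an assumption; adjust as needed).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff"}


def copy_in_batches(src_root, dst_root, batch_size=1000):
    """Copy every image under src_root into dst_root/batch_0000, batch_0001, ...

    Each batch folder receives at most batch_size files.
    dst_root is assumed to lie outside src_root.
    Returns the number of files copied.
    """
    copied = 0
    for dirpath, dirnames, filenames in os.walk(src_root):
        dirnames.sort()  # visit subfolders in a predictable order
        for name in sorted(filenames):
            extension = os.path.splitext(name)[1].lower()
            if extension not in IMAGE_EXTENSIONS:
                continue
            batch_dir = os.path.join(dst_root, "batch_%04d" % (copied // batch_size))
            os.makedirs(batch_dir, exist_ok=True)
            # Prefix a running counter so files with the same name in
            # different source folders do not overwrite each other.
            target = os.path.join(batch_dir, "%06d_%s" % (copied, name))
            shutil.copy2(os.path.join(dirpath, name), target)
            copied += 1
    return copied
```

Calling copy_in_batches("scans_as_received", "scans_flat") with two hypothetical folder names would leave the original tree untouched and fill scans_flat with the numbered batch folders.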
What are the exact limits? Dunno. But now I know enough to keep away from them.
July 11th, 2006 at 3:31 am
About ‘cp’ not being able to deal with too many files: this might be a limit on the size of the command line. The same problem often occurs on Linux.
July 11th, 2006 at 1:43 pm
Alan,
I believe you are absolutely right: it’s not so much the number of files as the number of characters required to list those files. Even though we don’t type very many characters, wildcards can expand into very long strings.
I don’t remember ever running into this on Linux, but it doesn’t surprise me. I guess I just never tried to handle that many files before.
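To make the character-count idea concrete, here is a rough Python sketch. The function names are mine, and the arithmetic is only an approximation: the system’s real accounting also includes the environment variables and some per-argument overhead, so treat the result as a ballpark. The one system call it relies on, os.sysconf("SC_ARG_MAX"), reports the limit on Unix-like systems.

```python
import glob
import os


def argument_bytes(args):
    """Approximate the space an argument list occupies:
    each argument's bytes plus one terminating NUL byte."""
    return sum(len(os.fsencode(arg)) + 1 for arg in args)


def fits_on_command_line(args, limit=None):
    """Return True if the arguments fit under the limit.

    With no limit given, ask the system for ARG_MAX. This ignores the
    space taken by environment variables, so it is an optimistic estimate.
    """
    if limit is None:
        limit = os.sysconf("SC_ARG_MAX")
    return argument_bytes(args) < limit


if __name__ == "__main__":
    # "scans/*.jpg" is a hypothetical pattern; a shell would expand it
    # into one argument per matching file before cp ever runs.
    expanded = ["cp"] + glob.glob("scans/*.jpg") + ["flat/"]
    print(len(expanded), "arguments,", argument_bytes(expanded), "bytes")
    print("fits:", fits_on_command_line(expanded))
```

A few thousand files with long path names can add up to hundreds of kilobytes this way, even though the command as typed is only a dozen characters.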