On Tue, Oct 26, 2004 at 04:52:12PM -0500, Matt Mitchell wrote:
Just seeking some opinions here...
I have observed some really poor performance in GFS when dealing with large numbers of small files. It seems to be designed to scale well with respect to throughput, at the apparent expense of metadata and directory operations, which are really slow. For example, in a directory with roughly 100,000 4k files, a simple 'ls -l' with lock_dlm took over three hours to complete on our test setup with no contention (only one machine had the disk mounted at all). (Using Debian packages dated 16 September 2004.)
Lots of small files can certainly expose some of the performance limitations of GFS. "Hours" sounds very odd, though, so I ran a couple of sanity tests on my own test hardware.
One node mounted with lock_dlm, the directory has 100,000 4k files, running "time ls -l | wc -l".
- dual P3 700 MHz, 256 MB, some old FC disks in a JBOD: 5 min 30 sec
- P4 2.4 GHz, 512 MB, iscsi to a netapp: 2 min 30 sec
Having more nodes mounted didn't change this. (Four nodes of the first kind all running this at the same time averaged about 17 minutes each.)
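For anyone wanting to repeat the test described above without typing the shell commands by hand, here is a rough Python sketch. It is not what was actually run for the numbers above (that was plain "time ls -l | wc -l"). The helper names make_small_files and timed_long_listing are made up for illustration, and the lstat-per-entry loop only approximates the work 'ls -l' does.

```python
import os
import tempfile
import time


def make_small_files(directory, count, size=4096):
    """Create `count` files of `size` bytes each in `directory`."""
    payload = b"\0" * size
    for i in range(count):
        with open(os.path.join(directory, f"f{i:06d}"), "wb") as fh:
            fh.write(payload)


def timed_long_listing(directory):
    """Stat every entry (roughly what 'ls -l' does).

    Returns (number_of_entries, elapsed_seconds).
    """
    start = time.perf_counter()
    entries = 0
    for name in os.listdir(directory):
        os.lstat(os.path.join(directory, name))
        entries += 1
    return entries, time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: a small count in a temp dir so the sketch runs anywhere.
    # Point it at a directory on the GFS mount and raise the count to
    # 100000 to approximate the test above.
    with tempfile.TemporaryDirectory() as d:
        make_small_files(d, 1000)
        n, secs = timed_long_listing(d)
        print(f"{n} entries in {secs:.3f} s")
```

Running the listing a second time mostly measures cached metadata, so for a cold number unmount and remount between runs.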
My initial setup was using a dinky SCSI-IDE RAID box that happened to have two interfaces. Now that we have the fibre channel hardware in-house I am recreating the setup on it in order to get some performance numbers on that.
It seems like there is a lot of contention for the directory inodes (which is probably unavoidable) and that would likely be helped by segregating the files into smaller subdirectories. Implementation-wise, is there a magic number or formula to follow for sizing these directories? Does the number of journals make a difference?
-m
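To make the "smaller subdirectories" idea above concrete, here is a minimal Python sketch of one common way to do it: hash the file name and use the result to pick a bucket directory. The function name bucketed_path and the default of 256 buckets are assumptions for illustration only. Nothing in this thread establishes a magic number for GFS, so the bucket count would need to be tuned by measurement.

```python
import hashlib
import os


def bucketed_path(root, filename, buckets=256):
    """Map `filename` to root/<bucket>/filename using a stable hash.

    `buckets` is an arbitrary illustrative default, not a GFS
    recommendation.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an integer, then reduce it
    # to a bucket index in the range [0, buckets).
    bucket = int.from_bytes(digest[:4], "big") % buckets
    return os.path.join(root, f"{bucket:03x}", filename)


if __name__ == "__main__":
    # "/mnt/gfs/data" is a hypothetical mount point.
    print(bucketed_path("/mnt/gfs/data", "report-000123.dat"))
```

With 100,000 files and 256 buckets each subdirectory ends up with about 390 entries on average. Because the mapping depends only on the name, any node can compute the path without consulting a shared index.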