Re: Very slow directory traversal

On Oct 06, 2007  00:10 -0700, Ross Boylan wrote:
> My last full backup of my Cyrus mail spool had 1,393,569 files and
> consumed about 4G after compression. It took over 13 hours.  Some
> investigation led to the following test:
>  time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/debian/user/

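FYI - "tar cf /dev/null" does not actually read any file data.  GNU tar
special-cases /dev/null as the archive and skips the reads entirely, so
this test only measures the cost of walking the directories and
stat()ing the files.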
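
If you want a timing that does include reading the file data, something
along these lines would do it.  This is only a rough sketch in Python,
not anything tar itself does; the names read_tree and timed_read_tree
are made up for illustration.

```python
import os
import stat
import time


def read_tree(root, chunk_size=1 << 20):
    """Walk root and read every regular file.

    Returns (file_count, byte_count). Symlinks, FIFOs and other
    special files are skipped so the walk cannot block on them.
    """
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not stat.S_ISREG(os.lstat(path).st_mode):
                    continue
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # File vanished or is unreadable; ignore it.
                continue
            files += 1
    return files, total


def timed_read_tree(root):
    """Return (file_count, byte_count, elapsed_seconds) for root."""
    start = time.perf_counter()
    files, total = read_tree(root)
    return files, total, time.perf_counter() - start
```

Calling timed_read_tree() on the same directory twice should show the
same cold-cache versus warm-cache difference, but with the data reads
included in both numbers.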

> That took 15 minutes the first time it ran, and 32 seconds when run
> immediately thereafter.  There were 355,746 files. This is typical of
> what I've been seeing: initial run is slow; later runs are much faster.

I'd expect this is because on the initial run the files are not visited
in on-disk inode order, which causes a lot of seeks, while later runs are
served straight from memory.  Probably not a lot you can do directly, but
e.g. pre-reading the inode table would be a good start.
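
To illustrate the seek issue: one way an application can reduce it is
to collect the directory entries first, sort them by inode number, and
only then stat() them, so the inode table is read roughly in disk order.
A minimal sketch in Python, assuming a single flat directory; the
function names are hypothetical and this is not code from tar or Cyrus.

```python
import os


def entries_by_inode(directory):
    """Return the names in directory, sorted by inode number.

    On Linux, DirEntry.inode() comes from the directory entry
    itself, so no per-file stat() is needed to build the list.
    """
    with os.scandir(directory) as it:
        entries = [(entry.inode(), entry.name) for entry in it]
    entries.sort()
    return [name for _inode, name in entries]


def stat_in_inode_order(directory):
    """lstat() every entry in inode order; return (name, stat) pairs."""
    return [
        (name, os.lstat(os.path.join(directory, name)))
        for name in entries_by_inode(directory)
    ]
```

Whether this helps depends on how far the directory's native order is
from inode order; it only matters on a cold cache.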


> I found some earlier posts on similar issues, although they mostly
> concerned apparently empty directories that took a long time.  Theodore
> Tso had a comment that seemed to indicate that hashing conflicts with
> Unix requirements.  I think the implication was that you could end up
> with linearized, or partly linearized searches under some scenarios.
> Since this is a mail spool, I think it gets lots of sync()'s.

There was an LD_PRELOAD library that Ted wrote that may also help:
http://marc.info/?l=mutt-dev&m=107226330912347&w=2

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users
