On Wed, 2007-10-10 at 09:59 -0600, Andreas Dilger wrote:
> On Oct 06, 2007 00:10 -0700, Ross Boylan wrote:
> > My last full backup of my Cyrus mail spool had 1,393,569 files and
> > consumed about 4G after compression. It took over 13 hours. Some
> > investigation led to the following test:
> > time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/debian/user/
>
> FYI - "tar cf /dev/null" actually skips reading any file data. The
> code special cases /dev/null and skips the read entirely.
>
> > That took 15 minutes the first time it ran, and 32 seconds when run
> > immediately thereafter. There were 355,746 files. This is typical of
> > what I've been seeing: initial run is slow; later runs are much faster.
>
> I'd expect this is because on the initial run the on-disk inode ordering
> causes a lot of seeks, and later runs come straight from memory. Probably
> not a lot you can do directly, but e.g. pre-reading the inode table would
> be a good start.

Judging from your comments and the thread you reference below, the
problem is that the order returned from readdir is not inode order. But
if tar, in this special case (/dev/null), doesn't actually read from the
file, why should it be so slow? Does it do something (stat?) that makes
it have to fetch the inode anyway?

> >
> > I found some earlier posts on similar issues, although they mostly
> > concerned apparently empty directories that took a long time. Theodore
> > Tso had a comment that seemed to indicate that hashing conflicts with
> > Unix requirements. I think the implication was that you could end up
> > with linearized, or partly linearized searches under some scenarios.
> > Since this is a mail spool, I think it gets lots of sync()'s.
>
> There was an LD_PRELOAD library that Ted wrote that may also help:
> http://marc.info/?l=mutt-dev&m=107226330912347&w=2
>

I got the code, but am not having much luck making it work. I've tried
various things.
The most recent is

cc -shared -fpic -o libsd_readdir.so spd_readdir.c # as me
# rest as root
# export LD_LIBRARY_PATH=./
# export LD_PRELOAD=libsd_readdir.so
# ldconfig -v -n $(pwd)
/usr/local/src/kernel/ext3-patch:
        libsd_readdir.so -> libsd_readdir.so
corn:/usr/local/src/kernel/ext3-patch# date; time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/pol/asdnet/
Wed Oct 10 23:16:44 PDT 2007
tar: Removing leading `/' from member names
Segmentation fault

I don't know how to make something for preload; can anyone give any
hints?

Should the module I'm attempting to load have any effect on the 15
minute time noted above for tar to /dev/null, or is it only relevant if
I am pulling data off the disk files?

Would there be any value in having some other program traverse the
directories before I do the backup, or would cache limits likely mean
the stuff from the start would be gone from the cache by the time I got
to the end, so that the backup would basically be starting fresh?

Thanks.
Ross

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users
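
To make the inode-order idea discussed above concrete, here is a minimal sketch in Python of what a separate "pre-traversal" program might look like. It is not Ted's spd_readdir code, and the function names are hypothetical. It assumes the slow part is fetching inodes in readdir order, so it reads all names in one directory, sorts them by the inode number found in the directory entry, and only then calls lstat() on each. It handles a single directory, does not recurse, and I have not measured whether it helps.

```python
import os


def entries_in_inode_order(directory):
    """Return the names in `directory`, sorted by inode number.

    os.scandir() takes the inode number from the directory entry itself
    on Unix, so the sort does not need a stat() call per file.
    """
    with os.scandir(directory) as it:
        pairs = [(entry.inode(), entry.name) for entry in it]
    pairs.sort()
    return [name for _, name in pairs]


def warm_inode_cache(directory):
    """lstat() every entry of `directory` in inode order.

    Returns the number of entries visited. The hope (an assumption, not
    a measured result) is that visiting inodes in on-disk order reduces
    seeking compared with visiting them in readdir order.
    """
    count = 0
    for name in entries_in_inode_order(directory):
        os.lstat(os.path.join(directory, name))
        count += 1
    return count
```

It would be called on a directory such as /var/spool/cyrus/mail/r/user/ross/debian/user/ before running tar. Whether the inodes it pulls in would still be cached by the end of a backup of this size is exactly the open question above.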