On Wed, 2007-10-10 at 23:37 -0700, Ross Boylan wrote:
> On Wed, 2007-10-10 at 09:59 -0600, Andreas Dilger wrote:
> > On Oct 06, 2007 00:10 -0700, Ross Boylan wrote:
> > > My last full backup of my Cyrus mail spool had 1,393,569 files and
> > > consumed about 4G after compression. It took over 13 hours. Some
> > > investigation led to the following test:
> > > time tar cf /dev/null /var/spool/cyrus/mail/r/user/ross/debian/user/
> >
> > FYI - "tar cf /dev/null" actually skips reading any file data. The
> > code special cases /dev/null and skips the read entirely.
> >
> > > That took 15 minutes the first time it ran, and 32 seconds when run
> > > immediately thereafter. There were 355,746 files. This is typical of
> > > what I've been seeing: initial run is slow; later runs are much faster.
> >
> > I'd expect this is because on the initial run the on-disk inode ordering
> > causes a lot of seeks, and later runs come straight from memory. Probably
> > not a lot you can do directly, but e.g. pre-reading the inode table would
> > be a good start.
> Judging from your comments and the thread you reference below, the
> problem is that the order returned from readdir is not inode order. But
> if tar, in this special case (/dev/null), doesn't actually read from the
> file, why should it be so slow? Does it do something (stat?) that makes
> it have to fetch the inode anyway?
>
> >
> > > I found some earlier posts on similar issues, although they mostly
> > > concerned apparently empty directories that took a long time. Theodore
> > > Tso had a comment that seemed to indicate that hashing conflicts with
> > > Unix requirements. I think the implication was that you could end up
> > > with linearized, or partly linearized searches under some scenarios.
> > > Since this is a mail spool, I think it gets lots of sync()'s.
> >
> > There was an LD_PRELOAD library that Ted wrote that may also help:
> > http://marc.info/?l=mutt-dev&m=107226330912347&w=2
> >
> I got the code, but am not having much luck making it work. I've tried
> various things. The most recent is
> cc -shared -fpic -o libsd_readdir.so spd_readdir.c # as me
> # rest as root
> # export LD_LIBRARY_PATH=./
> # export LD_PRELOAD=libsd_readdir.so
> # ldconfig -v -n $(pwd)
> /usr/local/src/kernel/ext3-patch:
>         libsd_readdir.so -> libsd_readdir.so
> corn:/usr/local/src/kernel/ext3-patch# date; time tar
> cf /dev/null /var/spool/cyrus/mail/r/user/ross/pol/asdnet/
> Wed Oct 10 23:16:44 PDT 2007
> tar: Removing leading `/' from member names
> Segmentation fault

Even stranger, when I try the same thing with a little test program that
calls readdir, it works.

I tried running tar as myself, but got the same segfault (the first test
I reported I ran as root).

tar doesn't look as if it's setuid
# ls -l /bin/tar
-rwxr-xr-x 1 root root 231188 2007-09-05 02:42 /bin/tar

>
> I don't know how to make something for preload; can anyone give any
> hints?
>
> Should the module I'm attempting to load have any effect on the 15
> minute time noted above for tar to /dev/null, or is it only relevant if
> I am pulling data off the disk files?
>
> Would there be any value in having some other program traverse the
> directories before I do the backup, or would cache limits likely mean
> the stuff from the start would be gone from the cache by the time I got
> to the end, so that the backup would basically be starting fresh?
>
>
> Thanks.
> Ross

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users
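
For anyone following the thread, the inode-ordering idea discussed above
(readdir order is not inode order, so touching entries in the order
returned can cause a lot of seeks) can be sketched in a few lines of C.
This is only an illustration under assumptions, not Ted's spd_readdir.c:
the function name stat_in_inode_order and the struct entry type are made
up here, it handles a single directory without recursing, and whether
such a pre-pass actually helps a later backup depends on how much of the
inode data stays in cache.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper type: one directory entry with the inode
 * number that readdir() reported for it. */
struct entry {
    ino_t ino;
    char *name;
};

/* qsort comparator: ascending inode number. */
static int by_inode(const void *a, const void *b)
{
    ino_t x = ((const struct entry *)a)->ino;
    ino_t y = ((const struct entry *)b)->ino;
    return (x > y) - (x < y);
}

/*
 * Hypothetical function: read all entries of `dir`, sort them by inode
 * number, then lstat() each one in that order.
 * Returns the number of entries successfully stat'ed, or -1 if the
 * directory cannot be opened.
 */
long stat_in_inode_order(const char *dir)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    struct entry *list = NULL;
    size_t n = 0, cap = 0;
    struct dirent *de;

    while ((de = readdir(d)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (n == cap) {
            size_t newcap = cap ? cap * 2 : 1024;
            struct entry *tmp = realloc(list, newcap * sizeof *tmp);
            if (tmp == NULL)
                break;          /* out of memory: use what we have */
            list = tmp;
            cap = newcap;
        }
        char *copy = strdup(de->d_name);
        if (copy == NULL)
            break;              /* out of memory: use what we have */
        list[n].ino = de->d_ino;
        list[n].name = copy;
        n++;
    }
    closedir(d);

    if (n > 1)
        qsort(list, n, sizeof *list, by_inode);

    long count = 0;
    char path[4096];
    struct stat st;
    for (size_t i = 0; i < n; i++) {
        int len = snprintf(path, sizeof path, "%s/%s", dir, list[i].name);
        /* Skip names that would not fit in the path buffer. */
        if (len > 0 && (size_t)len < sizeof path && lstat(path, &st) == 0)
            count++;
        free(list[i].name);
    }
    free(list);
    return count;
}
```

Calling stat_in_inode_order() on one mailbox directory, with a cold cache
and again with the entries stat'ed in plain readdir order, would be one
way to measure how much of the 15 minutes is due to ordering on a given
disk.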