On Sunday 18 May 2003 18:01, Michael Harris wrote:
> Just to add, I can attest that moving the files from the old dir to the new
> as described improves performance on my machines dramatically. In our
> service we end up with directories of 150k+ files which are generally
> touched only as they are added, though every file will be touched several
> times over a month. The files are each around 50kB. When the directory
> entry gets to be about 4MB it begins to take a long time for remote
> machines to copy files into the directory, maybe 4 seconds for a 50kB file
> on a switched 100 base network. The performance hit is worst with remote
> machines using SMB. Compressing the directory entry with
>
>     mkdir new
>     cp old/* new/
>     rm -rf old
>     mv new old
>
> definitely improves things, but generally when there gets to be more than
> 200k files we have to roll over to a new directory to keep things moving. I
> suspect the remote machines are effectively downloading the directory entry
> with each copy to the server, but I also see the smbd tasks pegging on the
> server as well, but never really investigated it. We see this with ext2 and
> ext3. Not really looking for a solution here but just offering the info,
> but if anyone has a quick fix please share it. I may try reiserfs someday
> but for now we just use thousands of directories for the files.
>
> Mike

Which way do you normally use to push the files when you don't use SMB?

>
> > "Alan R.Becker" <beckera@mail-now.com> wrote:
> > > (1) Is the assumption that directories don't compress when deleting
> > > files correct? How is this handled (in general terms)?
> >
> > That is correct. A deleted file leaves a "hole" in the directory
> > which a new addition can fill (if it fits).
> >
> > > (2) Is there any difference between ext2 and ext3?
> >
> > No.
> >
> > > (3) Does the htree code change the picture any (even
> > > though I don't use it, and won't until it is production)?
> >
> > No, htree will not release directory blocks.
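
A rough way to check the point made in (1) and (3) on a live system is to
compare the size of the directory itself with the number of entries it
currently holds. The sketch below is only an illustration: the function name
dir_report is made up, and it assumes GNU stat and GNU find as shipped on
Linux.

```shell
# dir_report DIR: print the directory's own size and its live entry count.
# Hypothetical helper; assumes GNU stat (-c) and GNU find (-printf).
dir_report() {
    report_dir=$1
    # For a directory, %s is the size of the directory file itself,
    # not the total size of the files stored in it.
    report_bytes=$(stat -c '%s' "$report_dir") || return 1
    # Emit one dot per entry so unusual file names cannot skew the count.
    report_entries=$(( $(find "$report_dir" -mindepth 1 -maxdepth 1 -printf '.' | wc -c) ))
    echo "dir_bytes=$report_bytes entries=$report_entries"
}
```

Run before and after a mass delete, e.g. "dir_report /path/to/maildir". Going
by the answers above, on ext2/ext3 the dir_bytes figure would be expected to
stay where it was while entries drops. Other filesystems may behave
differently.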
> >
> > > (4) Is it possible that the directories themselves
> > > were fragmented?
> >
> > Yes, very probable.
> >
> > However, to understand why things slowed down, a bit more info is needed.
> >
> > It is probable that the many little files in one typical directory are
> > splattered all over the disk. Does your workload regularly touch all the
> > files in these directories? If so then it may be suffering from this lack
> > of inter-file locality.
> >
> > If not then yes, perhaps the problem is due to large, fragmented
> > directories.
> >
> > How many bytes does a typical directory consume? If you have the disk
> > space, and are confident that (say) 64k is "enough" then perhaps you
> > could grow each user's mail directory to (say) 64k when that user is
> > created. This way they will have a nice unfragmented directory for all
> > time.
> >
> > > (5) After doing a "mkdir" to create a new directory, how many
> > > file entries can it hold before it would be expanded to accept
> > > another file?
> >
> > 4 kilobytes. Each directory entry consumes eight bytes, plus the length
> > of the name rounded up to a multiple of 4 bytes.
> >
> > > When a directory is expanded, how many additional
> > > file entries can be stored before needing another expansion?
> >
> > Another 4 kilobytes.
> >
> > > (6) Say I have a directory containing some files, then I delete
> > > some files, and finally I start adding files. Will new file
> > > entries use empty or vacated directory slots before expanding
> > > the directory?
> >
> > Deletion causes holes. Holes are coalesced within a 4k block. Holes are
> > allocated on a first-fit basis.
> >
> > > (7) I am aware of e2defrag (latest version I have found is 0.73).
> > > Does this program (or any other tool) perform any
> > > directory optimization that would affect this problem?
> >
> > It's obsolete.
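
The entry-size rule given in the answer to (5) can be turned into a
back-of-the-envelope estimate. This is a sketch, not the filesystem's own
accounting: the function names are made up, the 4096-byte block size is taken
from the answers above, and the "." and ".." entries and any slack at the end
of a block are ignored.

```shell
# entry_bytes NAME_LENGTH
# 8 bytes of header plus the name length rounded up to a multiple of 4.
entry_bytes() {
    echo $(( 8 + ( ($1 + 3) / 4 ) * 4 ))
}

# entries_per_block NAME_LENGTH [BLOCK_BYTES]
# How many equal-length entries fit in one directory block (default 4096).
entries_per_block() {
    echo $(( ${2:-4096} / $(entry_bytes "$1") ))
}

# dir_bytes_estimate FILE_COUNT NAME_LENGTH [BLOCK_BYTES]
# Approximate size of a freshly packed directory holding FILE_COUNT entries.
dir_bytes_estimate() {
    est_per=$(entries_per_block "$2" "${3:-4096}")
    est_blocks=$(( ($1 + est_per - 1) / est_per ))   # round up to whole blocks
    echo $(( est_blocks * ${3:-4096} ))
}
```

With 12-character names, "entries_per_block 12" prints 204. For a hypothetical
20-character name, "dir_bytes_estimate 150000 20" prints 4210688, i.e. about
4MB. That is in the same range as the directory size reported at the top of
this thread, though the real name lengths there are not stated.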
> >
> > For your purposes, all you'd need to do to defrag a directory is
> >
> >     mkdir new
> >     ln old/* new/
> >     rm -rf old
> >     mv new old
> >
> > If you use `cp' instead of `ln' then you'll defrag the files themselves,
> > and lay them out close to each other. Which is only important if your app
> > regularly touches lots of files in a single directory. It probably does
> > not.
> >
> > > (8) If e2defrag would be helpful, has it/is it being brought
> > > forward to operate correctly with current (RH 8/9) systems?
> > > I see some warnings about blocksize restrictions, etc.
> >
> > I haven't heard of anyone using it in ages.
> >
> > > (9) In designing new systems, are there some useful guidelines
> > > about the maximum number of files that can exist in a single
> > > directory without significant performance loss?
> > > I am interested in ext2, ext3, and htree.
> >
> > Non-htree gets awkward at a few thousand. htree appears to be OK up to
> > hundreds of thousands. Its practical scalability is unknown, really.
> >
> > _______________________________________________
> > Ext3-users@redhat.com
> > https://www.redhat.com/mailman/listinfo/ext3-users

-- 
e-admin internet gmbh
Andreas Gietl                    tel    +49 941 3810884
Ludwig-Thoma-Strasse 35          fax    +49 89 244329104
93051 Regensburg                 mobile +49 171 6070008
PGP/GPG key at http://www.e-admin.de/gpg.html
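
For directories of the size discussed in this thread, the mkdir/ln/rm/mv
sequence quoted above may run into "Argument list too long" when the shell
expands old/*, and the glob also skips dotfiles. The following is a sketch of
the same idea that avoids both. The function name and the temporary directory
name are made up, it assumes GNU find and GNU ln (for the -t option), and it
assumes nothing else writes to the directory while it runs.

```shell
# rebuild_dir DIR: recreate DIR so its entries are packed into fresh blocks.
# Hypothetical helper. Handles regular files only (subdirectories cannot be
# hard-linked) and does not copy the directory's mode or ownership.
rebuild_dir() {
    rb_old=${1%/}
    rb_new="$rb_old.rebuild.$$"   # made-up temp name; same parent, so same filesystem
    mkdir "$rb_new" || return 1
    # "-exec ... {} +" batches the arguments, so very large directories do not
    # overflow the command line, and find also picks up dotfiles.
    find "$rb_old" -mindepth 1 -maxdepth 1 -type f \
        -exec ln -t "$rb_new" {} + || return 1
    # Refuse to delete the old directory unless every entry was linked.
    rb_n_old=$(( $(find "$rb_old" -mindepth 1 -maxdepth 1 -printf '.' | wc -c) ))
    rb_n_new=$(( $(find "$rb_new" -mindepth 1 -maxdepth 1 -printf '.' | wc -c) ))
    if [ "$rb_n_old" -ne "$rb_n_new" ]; then
        echo "rebuild_dir: entry count mismatch, leaving $rb_old alone" >&2
        return 1
    fi
    rm -rf "$rb_old" && mv "$rb_new" "$rb_old"
}
```

Usage would be "rebuild_dir /path/to/maildir". As noted above, hard links only
repack the directory; swapping ln for "cp -p" in the -exec would also rewrite
the file data, at the cost of temporarily doubling the disk usage.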