>> RHEL 5.3
>> ~1000.000.000 files (1-30k)
>> ~7TB in total
>> //
>> I'm looking for a best practice when implementing this using
>> EXT3 (or some other FS if it shouldn't do the job.).

"Best practice" here would be a rather radical solution.

>> On average the reads dominate (99%), writes are only used for
>> updating and isn't a part of the service provided. The data
>> is divided into 200k directories with each some 5k files.
>> This ratio (dir/files) can be altered to optimize FS
>> performance.

> If you are writing to a local S-ATA disk, ext3/4 can write a
> few thousand files/sec without doing any fsync() operations.
> With fsync(), you will drop down quite a lot.

Unfortunately using 'fsync' is a good idea for production
systems.

Also note that writing 10^9 files at a rate of 10^3/s takes
10^6 seconds, about 11.6 days. That is the minimum time to
populate the filesystem, and also the minimum time to restore
it from backups.

> One layout for directories that works well with this kind of
> thing is a time based one (say YEAR/MONTH/DAY/HOUR/MIN where
> MIN might be 0, 5, 10, ..., 55 for example).

As to the problem above and this kind of solution, I reckon
that it is utterly absurd (and I could have used much stronger
words).

BTW, the sort of people who seriously consider such utter
absurdities try to do a thorough job, and I don't want to know
how the underlying storage system is structured :-).

If anything, consider the obvious (obvious except to those who
want to use a filesystem as a small record database), which is
'fsck' time, in particular given the structure of 'ext3' (or
'ext4') metadata.

So: just don't use a filesystem as a database, spare us the
horror; use a database, even a simple one, which is not utterly
absurd.
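To make "even a simple one" concrete, here is a minimal sketch, assuming SQLite through Python's standard 'sqlite3' module. That choice is mine, purely for illustration; nothing in this thread names a particular database, and the table name, key layout and helper functions are all hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small key/value store for the records.

    'records', 'key' and 'body' are hypothetical names for this sketch.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        " key TEXT PRIMARY KEY,"
        " body BLOB NOT NULL)"
    )
    return conn


def put_many(conn, items):
    """Insert or update many (key, body) pairs in a single transaction."""
    # 'with conn' commits once at the end (or rolls back on error),
    # instead of committing once per record.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO records (key, body) VALUES (?, ?)",
            items,
        )


def get(conn, key):
    """Return the stored bytes for 'key', or None if there is no such record."""
    row = conn.execute(
        "SELECT body FROM records WHERE key = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    conn = open_store()
    # What would have been dir/file paths become plain keys.
    put_many(
        conn,
        (
            (f"dir{d:06d}/file{f:04d}", b"x" * 1024)
            for d in range(3)
            for f in range(5)
        ),
    )
    print(len(get(conn, "dir000001/file0003")))
```

The one design point worth noting is the batching in 'put_many': the cost of making data durable is paid per commit rather than per record, which is exactly where the one-file-per-record approach with 'fsync' hurts. Whether this particular database suits 10^9 records is something to measure, not something this sketch shows.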
Compare these two:

  http://lists.gllug.org.uk/pipermail/gllug/2005-October/055445.html
  http://lists.gllug.org.uk/pipermail/gllug/2005-October/055488.html

Anyhow I do see a lot of inane questions and "solutions" like
the above in various lists (usually the XFS one, which attracts
a lot of utter absurdities).

> When reading files in ext3 (and ext4) or doing other bulk
> operations like a large deletion, it is important to sort the
> files by inode (do the readdir, get say all of the 5k files in
> your subdir and then sort by inode before doing your bulk
> operation).

Good idea, but it is best to avoid the cases where this matters.

_______________________________________________
Ext3-users mailing list
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users