[ ... ]

>> Also note that in order to write 10^9 files at a 10^3/s rate
>> takes 10^6 seconds; roughly 10 days to populate the
>> filesystem (or at least that long to restore it from backups).

> One thing that you can do when doing bulk loads of files (say,
> during a restore or migration) is to use a two-phase write.
> First, write each of a batch of files (say 1000 files at a
> time), then go back and reopen/fsync/close them.

Why not just restore a database?

>>> One layout for directories that works well with this kind of
>>> thing is a time-based one (say YEAR/MONTH/DAY/HOUR/MIN, where
>>> MIN might be 0, 5, 10, ..., 55 for example).

>> As to the problem above and this kind of solution, I reckon
>> that it is utterly absurd (and I could have used much stronger
>> words).

> When you deal with systems that store millions of files,

Millions of files may work, but 1 billion is an utter
absurdity. A filesystem that can reasonably store 1 billion
small files in 7TB is an unsolved research issue...

The obvious thing to do is to use a database, and there is no
way around this point.

If one genuinely needs to store a lot of files, why not split
them into many independent filesystems? A single large one is
only needed to allow hard linking or to provide a single large
space pool, and in applications where the directory structure
above makes any kind of sense, neither is usually required.

> you pretty much always are going to use some kind of made-up
> directory layout.

Filesystems are usually used for storing somewhat unstructured
information, not records that can be looked up with a simple
"YEAR/MONTH/DAY/HOUR/MIN" key, which seems very well suited to
something like a simple DBMS. There is even a tendency to move
filesystems into databases, as the latter scale a lot better.

And for cases where a filesystem still makes sense, I would
rather use, instead of the inane many-level directory structure
above, a filesystem design with proper tree indexes, and
perhaps even one with the ability to store small files in
inodes.

[ ... ]

> You can always try to write 1 million files in a single
> subdirectory,

Again, I'd rather avoid anything like that.

> but if you are writing your own application, using this kind
> of scheme is pretty trivial.

And an utter absurdity for 1 billion files in 200k directories,
both on its own merits and compared to the OBVIOUS alternative.

>> If anything, consider the obvious (obvious except to those
>> who want to use a filesystem as a small-record database),
>> which is 'fsck' time, in particular given the structure of
>> 'ext3' (or 'ext4') metadata.

> fsck time has improved quite a lot recently with ext4 (and
> with xfs).

How many months do you think a 7TB filesystem with 1 billion
files would take to 'fsck', even with those improvements?

[ ... ]
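For what it's worth, the two-phase bulk load suggested above is
simple enough to sketch (Python; the batch size and the
(path, data) input are made up for illustration, and parent
directories are assumed to exist):

    import os

    BATCH_SIZE = 1000  # assumed from the suggestion above

    def flush_batch(paths):
        # Phase 2: reopen/fsync/close each file in the batch. By
        # the time fsync runs, most of the dirty pages are already
        # on their way to disk, so the fsyncs are comparatively cheap.
        for path in paths:
            fd = os.open(path, os.O_RDONLY)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)

    def bulk_load(items):
        """items: iterable of (path, data) pairs."""
        batch = []
        for path, data in items:
            # Phase 1: write the file without fsync, letting the
            # kernel schedule writeback in the background.
            with open(path, "wb") as f:
                f.write(data)
            batch.append(path)
            if len(batch) >= BATCH_SIZE:
                flush_batch(batch)
                batch = []
        if batch:
            flush_batch(batch)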
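And for concreteness, the time-based layout under discussion
maps a timestamp to a path along these lines (a sketch only;
the base directory and the 5-minute bucketing are assumptions
taken from the example above):

    from datetime import datetime, timezone

    def bucket_path(base, ts):
        """Map a Unix timestamp to YEAR/MONTH/DAY/HOUR/MIN, with
        MIN rounded down to a 5-minute bucket (0, 5, ..., 55)."""
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
        minute = (dt.minute // 5) * 5
        return "%s/%04d/%02d/%02d/%02d/%02d" % (
            base, dt.year, dt.month, dt.day, dt.hour, minute)

    # e.g. bucket_path("/srv/files", 1234567890)
    #   -> '/srv/files/2009/02/13/23/30'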
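Whereas the simple DBMS I keep pointing at can be a one-table
affair; a minimal sketch with SQLite (schema, file name, and
sample values invented for illustration):

    import sqlite3

    db = sqlite3.connect("records.db")
    db.execute("""
        CREATE TABLE IF NOT EXISTS records (
            ts   INTEGER NOT NULL,   -- timestamp, the natural key
            body BLOB    NOT NULL    -- the small "file"
        )
    """)
    db.execute("CREATE INDEX IF NOT EXISTS records_ts ON records(ts)")

    # One row per record; a single commit batches many inserts.
    with db:
        db.executemany(
            "INSERT INTO records (ts, body) VALUES (?, ?)",
            [(1234567890, b"..."), (1234567895, b"...")])

    # Lookup by time range instead of walking YEAR/MONTH/DAY/HOUR/MIN:
    rows = db.execute(
        "SELECT body FROM records WHERE ts BETWEEN ? AND ?",
        (1234567800, 1234567900)).fetchall()

The B-tree index does the job of the five directory levels, and
bulk loading is bounded by sequential writes rather than by
per-file metadata updates and fsyncs.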