On Thu, Aug 19, 2010 at 01:03:06PM -0600, Andreas Dilger wrote: > > I think the problem isn't just the TOTAL amount of RAM being used, > but the fact that this piece of code is trying to do a SINGLE > allocation that is HUGE. The second problem is that the constant > re-allocation of this huge array (every 100 insertions) means that > it can never really exceed 1/2 of RAM in size. Part of the problem is that the data structures are optimized for relatively few hard links and a reasonable number of directories compared to other types of inodes. For people who are using hard-link backup schemes, these assumptions are violated in some fairly massive ways. Something that might be interesting to do is to keep some statistics in the superblock about the number of hard links. This would allow e2fsck to allocate the data structures appropriately up front, and maybe allow it to switch to some other representation if it turns out the file system is one which is dense with hard links. > That said, any insert-optimized tree structure with a high fan-out > would be suitable. Elements are almost never deleted, and we would > never need to compact the tree (it is freed as a whole when it is > done). We could try to create a tree-optimized data structure, but I wonder if it's worth it. IIRC berk_db has an in-memory and file-backed option, and it has a btree option. Using that might be easier than trying to hand-code something special. - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html