tytso@xxxxxxx wrote:
> 1) It's not just about storage efficiency, but also about transfer
> efficiency. Disk drives generally like to transfer hunks of data in
> 16k to 64k at a time. So getting related pieces of small hunks of
> data read at the same time, we can win big on performance. BUT, it's
> extremely hard to do this at the filesystem level, since the
> application is much more likely to know which micro-file of 16 bytes
> is likely to be needed at the same time as some other micro-file which
> is only 16 bytes long.

Most filesystems (as you'll know) use locality of reference to cluster
files. From the studies I've seen, it works quite well.

When I added tail-end packing to SquashFS, I looked into various
strategies to determine which tail-ends (fragments) to pack together.
As SquashFS is a read-only filesystem, this can be done using off-line
analysis. After evaluating various strategies (best fit, first fit,
same-size, etc.), I found the best compression of these packed tail-ends
was achieved by packing small files together in alphabetical order from
the same directory. Such packing also achieved the highest performance
improvements when reading from CD-ROM (SquashFS is used for LiveCDs, so
changes in file placement can have a dramatic effect on seeking).

This result was interesting from my POV because it confirmed
conventional locality-of-reference wisdom.

Phillip Lougher
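
For illustration only, here is a rough Python sketch of the "alphabetical order from the same directory" packing described above. It is not the mksquashfs code. The function name pack_tail_ends, the (path, size) input format, and the 64 KiB default block size are all assumptions made for this example. It also assumes a fragment block may carry on into the next directory once the current one is exhausted.

```python
import posixpath


def pack_tail_ends(files, block_size=65536):
    """Group file tail-ends into fragment blocks (illustrative sketch).

    files: iterable of (path, size_in_bytes) pairs (hypothetical format).
    Returns a list of fragment blocks, each a list of (path, tail_length).
    """
    # Locality of reference: order by directory first, then by file name,
    # so neighbours in a directory end up in the same fragment block.
    ordered = sorted(
        files,
        key=lambda f: (posixpath.dirname(f[0]), posixpath.basename(f[0])),
    )

    blocks = []
    current, used = [], 0
    for path, size in ordered:
        # The tail-end is whatever is left after the last full data block.
        tail = size % block_size
        if tail == 0:
            continue  # no tail-end to pack
        # Start a new fragment block when this tail would not fit.
        if current and used + tail > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append((path, tail))
        used += tail
    if current:
        blocks.append(current)
    return blocks
```

Unlike best fit or first fit, this never searches for a block with spare room. It fills blocks strictly in sorted order, trading some slack space for keeping related files adjacent.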