J. R. Okajima wrote:
Hello Phillip,

I've found an interesting issue with squashfs. Please give me some guidance or advice. In short: why does squashfs read and decompress the same block several times? And is a nested fs-image always better for squashfs?
Hi Junjiro,

What I think you're seeing here is the negative effect of fragment blocks (tail-end packing) in the native Squashfs example, and the positive effect of VFS/loop block caching in the ext3-on-Squashfs example.

In the native Squashfs example (a.img), multiple files are packed into fragment blocks and compressed together to get better compression. Fragment blocks are efficient at getting better compression, but are highly CPU-inefficient to decompress under very random file access patterns. It is easy to see why: access of file X (of, say, 5K) packed into a 128K fragment containing files W, X, Y and Z requires the decompression of the entire 128K fragment. If the file access pattern is extremely random (i.e. it doesn't exhibit any locality of reference), none of the other files are accessed before the decompressed fragment is evicted from the fragment cache. So, when the other files are read, the fragment needs to be read and decompressed a second (and third, etc.) time.

As I said, fragments are good at getting better compression, and they are CPU-efficient under normal file access patterns which exhibit locality of reference: files in the same directory (and which hence are in the same fragment) tend to be accessed at the same time. In the above example this means that once the fragment has been decompressed for file X, the other files W, Y and Z will likely be read soon afterwards and will find the already decompressed fragment in the fragment cache. Indeed, for this scenario fragments actually improve I/O performance and reduce CPU overhead.

In the ext3-on-Squashfs example you're seeing the effect of VFS/loopback caching of the decompressed data from Squashfs. In this example the ext3 file is stored as a single file inside Squashfs, compressed in blocks of 128K.
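The effect of access-pattern locality on fragment decompression can be sketched with a toy model (this is not Squashfs code; the file counts, fragment packing, and cache size below are made up for illustration). Files are packed four to a fragment, a small LRU cache holds recently decompressed fragments, and we count how many fragment decompressions each access order costs:

```python
import random
from collections import OrderedDict

# Toy model: 64 small files packed four-per-fragment, with a small
# LRU cache of decompressed fragments (Squashfs' fragment cache is
# similar in spirit, though the real sizes and policy differ).
FILES_PER_FRAGMENT = 4
NUM_FILES = 64
CACHE_SLOTS = 3  # decompressed fragments held at once

def count_decompressions(access_order):
    cache = OrderedDict()  # fragment id -> (modelled) decompressed data
    decompressions = 0
    for f in access_order:
        frag = f // FILES_PER_FRAGMENT
        if frag in cache:
            cache.move_to_end(frag)        # cache hit: no decompress
        else:
            decompressions += 1            # whole fragment decompressed
            cache[frag] = True
            if len(cache) > CACHE_SLOTS:
                cache.popitem(last=False)  # evict least recently used
    return decompressions

sequential = list(range(NUM_FILES))  # good locality: neighbours share a fragment
shuffled = sequential[:]
random.seed(0)
random.shuffle(shuffled)             # no locality of reference

print("sequential:", count_decompressions(sequential))  # 16: one per fragment
print("random:    ", count_decompressions(shuffled))    # many repeat decompressions
```

With good locality every fragment is decompressed exactly once (16 fragments, 16 decompressions); with the shuffled order most accesses miss the small cache, so the same fragments are decompressed again and again, which is the CPU overhead described above.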
In normal operation the mounted ext3 filesystem will issue 4K block reads to the loopback file, which will cause the underlying 128K compressed block to be read from the Squashfs file, and this decompressed data will go into the VFS page cache. With random file access of the ext3 filesystem only a small part of that 128K block may be required at the time; however, much later ext3 filesystem accesses requiring that 128K block will be satisfied from the page cache still holding the decompressed block, and so this later access *won't* go to Squashfs and there will not be a second decompress of the block.

The overall effect is that native Squashfs performs worse, but this is rather unsurprising given the negative effect of fragments and the positive effect of VFS caching in the ext3-on-Squashfs case.

As I said, I suspect the major cause of the worse performance for Squashfs is fragments coupled with your very random file access pattern. Fragments simply do not work well with atypical random access patterns. If you expect random access you should *not* use fragments, and should specify the -no-fragments option to Mksquashfs.

The default Mksquashfs options (duplicate detection, fragments, 128K blocks) are a compromise which gives high compression coupled with fast I/O for typical file access patterns. However, people should *always* play around with the defaults, as the different compression and I/O performance achieved may suit their needs better. Unfortunately very few people do so, which is a shame, as I often see people complaining about various aspects of Squashfs which I know would be solved if they'd only use different Mksquashfs settings.

Regards

Phillip
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
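As a footnote to the advice above, the -no-fragments option is passed directly to Mksquashfs when building the image. A sketch of the invocations (the source directory and image names here are hypothetical):

```shell
# Disable tail-end packing: each file decompresses independently,
# which suits random access at the cost of a somewhat larger image.
mksquashfs /path/to/rootfs random-access.img -no-fragments

# Default build for comparison: duplicate detection, fragments,
# 128K blocks -- best compression for typical access patterns.
mksquashfs /path/to/rootfs default.img
```

Comparing the size and read performance of the two images against your own workload is the quickest way to see which trade-off suits it.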