On Fri, Oct 22, 2021 at 10:41:27AM +0200, Jan Kara wrote:
> On Thu 21-10-21 21:04:45, Phillip Susi wrote:
> >
> > Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
> >
> > > As far as I can tell, the following filesystems support compressed data:
> > >
> > > bcachefs, btrfs, erofs, ntfs, squashfs, zisofs
> > >
> > > I'd like to make it easier and more efficient for filesystems to
> > > implement compressed data. There are a lot of approaches in use today,
> > > but none of them seem quite right to me. I'm going to lay out a few
> > > design considerations next and then propose a solution. Feel free to
> > > tell me I've got the constraints wrong, or suggest alternative solutions.
> > >
> > > When we call ->readahead from the VFS, the VFS has decided which pages
> > > are going to be the most useful to bring in, but it doesn't know how
> > > pages are bundled together into blocks. As I've learned from talking to
> > > Gao Xiang, sometimes the filesystem doesn't know either, so this isn't
> > > something we can teach the VFS.
> > >
> > > We (David) added readahead_expand() recently to let the filesystem
> > > opportunistically add pages to the page cache "around" the area requested
> > > by the VFS. That reduces the number of times the filesystem has to
> > > decompress the same block. But it can fail (due to memory allocation
> > > failures or pages already being present in the cache). So filesystems
> > > still have to implement some kind of fallback.
> >
> > Wouldn't it be better to keep the *compressed* data in the cache and
> > decompress it multiple times if needed rather than decompress it once
> > and cache the decompressed data? You would use more CPU time
> > decompressing multiple times, but be able to cache more data and avoid
> > more disk IO, which is generally far slower than the CPU can decompress
> > the data.
>
> Well, one of the problems with keeping compressed data is that for mmap(2)
> you have to have pages decompressed so that CPU can access them.
> So keeping compressed data in the page cache would add a bunch of
> complexity. That being said keeping compressed data cached somewhere else
> than in the page cache may certainly be worth it and then just filling
> page cache on demand from this data...

It can be cached with a special internal inode, so there is no need to
take care of memory reclaim or migration yourself. Otherwise, all of
that needs to be taken care of.

For fixed-sized input compression, the cached data is reclaimed in page
units, so this isn't very friendly, since the pages of such data are all
coupled together. But for fixed-sized output compression, it's quite
natural.

Thanks,
Gao Xiang

>
> 								Honza
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR