Re: Readahead for compressed data

Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> · Fri, 22 Oct 2021 17:11:34 +0800

On Fri, Oct 22, 2021 at 10:41:27AM +0200, Jan Kara wrote:
> On Thu 21-10-21 21:04:45, Phillip Susi wrote:
> > 
> > Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
> > 
> > > As far as I can tell, the following filesystems support compressed data:
> > >
> > > bcachefs, btrfs, erofs, ntfs, squashfs, zisofs
> > >
> > > I'd like to make it easier and more efficient for filesystems to
> > > implement compressed data.  There are a lot of approaches in use today,
> > > but none of them seem quite right to me.  I'm going to lay out a few
> > > design considerations next and then propose a solution.  Feel free to
> > > tell me I've got the constraints wrong, or suggest alternative solutions.
> > >
> > > When we call ->readahead from the VFS, the VFS has decided which pages
> > > are going to be the most useful to bring in, but it doesn't know how
> > > pages are bundled together into blocks.  As I've learned from talking to
> > > Gao Xiang, sometimes the filesystem doesn't know either, so this isn't
> > > something we can teach the VFS.
> > >
> > > We (David) added readahead_expand() recently to let the filesystem
> > > opportunistically add pages to the page cache "around" the area requested
> > > by the VFS.  That reduces the number of times the filesystem has to
> > > decompress the same block.  But it can fail (due to memory allocation
> > > failures or pages already being present in the cache).  So filesystems
> > > still have to implement some kind of fallback.
> > 
> > Wouldn't it be better to keep the *compressed* data in the cache and
> > decompress it multiple times if needed rather than decompress it once
> > and cache the decompressed data?  You would use more CPU time
> > decompressing multiple times, but be able to cache more data and avoid
> > more disk IO, which is generally far slower than the CPU can decompress
> > the data.
> 
> Well, one of the problems with keeping compressed data is that for mmap(2)
> you have to have pages decompressed so that CPU can access them. So keeping
> compressed data in the page cache would add a bunch of complexity. That
> being said keeping compressed data cached somewhere else than in the page
> cache may certainly me worth it and then just filling page cache on demand
> from this data...

It can be cached with a special internal inode, so no need to take
care of the memory reclaim or migration by yourself.

Otherwise, these all need to be take care of. For fixed-sized input
compression, since they are reclaimed in page unit, so it won't be
quite friendly since such data is all coupling. But for fixed-sized
output compression, it's quite natural.

Thanks,
Gao Xiang

> 
> 								Honza
> -- 
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR