Re: fs compression

Tom Marshall <tom@xxxxxxxxx> · Thu, 28 May 2015 17:18:31 -0700

So I've just gotten this all working.  The last notable change I made was to
inode size: I added an i_compressed_size member and then did some macro
hackage to ensure that the fs implementation (eg. fs/ext4) sees the
compressed size while everything else sees the uncompressed size.

I'll be testing further tomorrow.

On Thu, May 28, 2015 at 01:55:15PM -0700, Tom Marshall wrote:
> On Wed, May 27, 2015 at 07:38:00PM -0400, Theodore Ts'o wrote:
> > On Wed, May 27, 2015 at 11:53:17AM -0700, Tom Marshall wrote:
> > > But one thing I'm wrestling with is how to be asynchronously notified when
> > > the lower readpage/readpages complete.  The two ideas that come to mind are
> > > (1) plumbing a callback into mpage_end_io(), (2) allowing override of
> > > mpage_end_io() with a custom function, (3) creating kernel threads analogous
> > > to kblockd to wait for pending pages.
> > 
> > Not all file systems use mpage_end_io(), so that's not a good
> > solution.
> > 
> > You can do something like
> > 
> > 	wait_on_page_bit(page, PG_uptodate);
> > 
> > ... although to be robust you will also need to wake up if PG_error is
> > set (if there is an I/O error, PG_error is set instead of
> > PG_uptodate).  So that means you'd have to spin your own wait function
> > using the waitqueue primitives and page_waitqueue(), using
> > __wait_on_bit() as an initial model.
> 
> My current thought is to use a workqueue and queue a work object for each
> cluster read that is in flight.  The work function does wait_on_page_locked
> for each lower page.  If/when all complete without error, the data is then
> decompressed and the upper pages are entered.  This is currently a work in
> progress, it's not yet functional.
> 
> So basically my questions here are:
> 
> (1) Does this seem sane?
> 
> (2) Is it ok to wait on !locked instead of waiting on (uptodate|error)?
> 
> (3) Do I need to mark a pending cluster pending to prevent simultaneous
>     reads of the same cluster (especially since the cluster size will be
>     much larger than the lower block size)?
> 
> Also note I've now realized that the logical conclusion to the wrapper inode
> idea is a stacked filesystem.  That's not the direction we are aiming for,
> so instead I'm adding a "struct xcomp_inode_info" to ext4_inode_info, which
> can then be passed back into the xcomp functions.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html