On Wed, May 27, 2015 at 07:38:00PM -0400, Theodore Ts'o wrote: > On Wed, May 27, 2015 at 11:53:17AM -0700, Tom Marshall wrote: > > But one thing I'm wrestling with is how to be asynchronously notified when > > the lower readpage/readpages complete. The two ideas that come to mind are > > (1) plumbing a callback into mpage_end_io(), (2) allowing override of > > mpage_end_io() with a custom function, (3) creating kernel threads analogous > > to kblockd to wait for pending pages. > > Not all file systems use mpage_end_io(), so that's not a good > solution. > > You can do something like > > wait_on_page_bit(page, PG_uptodate); > > ... although to be robust you will also need to wake up if PG_error is > set (if there is an I/O error, PG_error is set instead of > PG_uptodate). So that means you'd have to spin your own wait function > using the waitqueue primitives and page_waitqueue(), using > __wait_on_bit() as an initial model. My current thought is to use a workqueue and queue a work object for each cluster read that is in flight. The work function does wait_on_page_locked for each lower page. If/when all complete without error, the data is then decompressed and the upper pages are entered. This is currently a work in progress, it's not yet functional. So basically my questions here are: (1) Does this seem sane? (2) Is it ok to wait on !locked instead of waiting on (uptodate|error)? (3) Do I need to mark a pending cluster pending to prevent simultaneous reads of the same cluster (especially since the cluster size will be much larger than the lower block size)? Also note I've now realized that the logical conclusion to the wrapper inode idea is a stacked filesystem. That's not the direction we are aiming for, so instead I'm adding a "struct xcomp_inode_info" to ext4_inode_info, which can then be passed back into the xcomp functions. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html