[RFC] Make ->readpage synchronous

Ever since akpm introduced ->readpages() in 2002, it has been the
primary way to bring pages into cache.  Even though ->readpages() has
now been replaced with ->readahead(), readahead is still the way to
bring pages into cache asynchronously.

Three of the four current callers of ->readpage rely on readahead to
bring pages into cache asynchronously and want synchronous semantics
from ->readpage.

generic_file_buffered_read():
                error = mapping->a_ops->readpage(filp, page);
                if (!PageUptodate(page)) {
                        error = lock_page_killable(page);

filemap_fault():
        error = mapping->a_ops->readpage(file, page);
        if (!error) {
                wait_on_page_locked(page);

do_read_cache_page():
                        err = mapping->a_ops->readpage(data, page);
                page = wait_on_page_read(page);

(If your brain isn't as deep into the page cache as mine is right now:
the page remains locked until I/O has completed, so all of these callers
end up waiting for I/O to complete.)

The one caller which (maybe?) wants async semantics is swap-over-NFS (the
SWP_FS case in swap_readpage()).  I'm not really familiar with the swap
code.  Should this switch to using ->direct_IO like __swap_writepage()?
Or ->read_iter() perhaps?

So the way is clear for everyone except NFS to start to move to having
a synchronous ->readpage().  I think we're all in favour of gradual
transitions.  My plan is to add a new return code from ->readpage called
AOP_UPDATED_PAGE (to rhyme with AOP_TRUNCATED_PAGE).  The semantics
are that the page has remained locked since ->readpage was called,
the necessary read I/Os have completed and PageUptodate is now true.
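
A minimal user-space sketch of that contract, seen from the caller's
side.  Nothing here is kernel code: struct mock_page, the mock_*
functions and the numeric value of AOP_UPDATED_PAGE are all assumptions
made for illustration, as is the choice to have the failing ->readpage
unlock the page before returning an errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Proposed return code; the numeric value is an assumption of this sketch. */
#define AOP_UPDATED_PAGE	0x80002

/* Simplified stand-in for struct page; not the kernel type. */
struct mock_page {
	bool locked;
	bool uptodate;
};

typedef int (*mock_readpage_fn)(struct mock_page *page);

/* Hypothetical synchronous ->readpage that succeeds. */
int mock_readpage_sync(struct mock_page *page)
{
	assert(page->locked);		/* entered with the page locked */
	page->uptodate = true;		/* pretend the read I/O completed here */
	return AOP_UPDATED_PAGE;	/* page is deliberately left locked */
}

/* Hypothetical synchronous ->readpage that fails with a specific errno. */
int mock_readpage_fail(struct mock_page *page)
{
	assert(page->locked);
	page->locked = false;		/* assumption: error return unlocks */
	return -ETIMEDOUT;
}

/* Caller side: lock, call ->readpage, act on the return code. */
int mock_read_one_page(struct mock_page *page, mock_readpage_fn readpage)
{
	int ret;

	page->locked = true;
	ret = readpage(page);
	if (ret == AOP_UPDATED_PAGE) {
		/* Contract: still locked, I/O finished, Uptodate set. */
		assert(page->locked && page->uptodate);
		page->locked = false;	/* the caller does the unlock */
		return 0;
	}
	return ret;			/* any errno the fs chose to return */
}
```

The point of the sketch is that the caller never has to wait on the
page lock to learn the outcome: the return value alone says whether the
page is Uptodate or which error occurred.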

So why bother?  Better error handling.  If you do async readpage, the best
we can do is -EIO, because we only have one bit (PageError) to record the
failure.  With a sync readpage, the fs can return any error from
->readpage.  We can also stop using
the PageError bit for both read and write errors.  Which means we can
stop _clearing_ the PageError bit in the VFS before we call into the
filesystem, potentially losing the information that this page had a
write error.

So, to recap, in the new scheme, if you get an error while doing an async
read, leave the page !Uptodate and don't set PageError.  The VFS will notice
the page is !Uptodate (this can happen for a number of reasons, not just a
failed ->readahead) and call ->readpage().  At that point, your fs should
retry the read.  It can return whatever errno it likes at that point.
Or the read succeeds this time and you return AOP_UPDATED_PAGE to let
the VFS know you succeeded without unlocking the page.
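
The recap can be sketched as a small user-space simulation.  Everything
below is hypothetical scaffolding rather than kernel code: struct
mock_page, the mock_* helpers, the io_failures_left/fail_errno fields
used to inject failures, and the numeric value of AOP_UPDATED_PAGE are
assumptions, and so is unlocking the page on the error return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Proposed return code; the numeric value is an assumption of this sketch. */
#define AOP_UPDATED_PAGE	0x80002

/* Simplified stand-in for struct page plus failure injection. */
struct mock_page {
	bool locked;
	bool uptodate;
	bool error;		/* stands in for PageError */
	int io_failures_left;	/* hypothetical: reads that will still fail */
	int fail_errno;		/* hypothetical: errno reported on failure */
};

/* Hypothetical backing-store read: 0 on success, negative errno on failure. */
int mock_do_io(struct mock_page *page)
{
	if (page->io_failures_left > 0) {
		page->io_failures_left--;
		return -page->fail_errno;
	}
	return 0;
}

/* Async path (->readahead analogue). */
void mock_readahead(struct mock_page *page)
{
	page->locked = true;
	if (mock_do_io(page) == 0)
		page->uptodate = true;
	/* On failure: leave the page !Uptodate and do not set page->error. */
	page->locked = false;
}

/* Sync path (->readpage analogue): retry the read, report the real errno. */
int mock_readpage(struct mock_page *page)
{
	int err;

	assert(page->locked);
	err = mock_do_io(page);
	if (err) {
		page->locked = false;	/* assumption: error return unlocks */
		return err;
	}
	page->uptodate = true;
	return AOP_UPDATED_PAGE;	/* success, page still locked */
}

/* VFS side: notice !Uptodate and fall back to the synchronous read. */
int mock_vfs_get_page(struct mock_page *page)
{
	int ret;

	if (page->uptodate)
		return 0;
	page->locked = true;
	ret = mock_readpage(page);
	if (ret == AOP_UPDATED_PAGE) {
		page->locked = false;
		return 0;
	}
	return ret;
}
```

With one injected failure, mock_readahead() leaves the page !Uptodate
and mock_vfs_get_page() then succeeds on the retry; with two, the caller
gets the injected errno back instead of a blanket -EIO, and the error
flag is never touched on either path.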

You may get to see the AOP_UPDATED_PAGE return code appear soon ...
it solves an unrelated problem for me with the THP code (where the
requested page is Uptodate, but the entire THP is !Uptodate).


