On Mon, 2021-06-28 at 17:12 -0400, David Wysochanski wrote: > On Mon, Jun 28, 2021 at 3:09 PM Trond Myklebust > <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > On Mon, 2021-06-28 at 13:39 -0400, Dave Wysochanski wrote: > > > Earlier commits refactored some NFS read code and removed > > > nfs_readpage_async(), but neglected to properly fixup > > > nfs_readpage_from_fscache_complete(). The code path is > > > only hit when something unusual occurs with the cachefiles > > > backing filesystem, such as an IO error or while a cookie > > > is being invalidated. > > > > > > Signed-off-by: Dave Wysochanski <dwysocha@xxxxxxxxxx> > > > --- > > > fs/nfs/fscache.c | 14 ++++++++++++-- > > > 1 file changed, 12 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c > > > index c4c021c6ebbd..d308cb7e1dd4 100644 > > > --- a/fs/nfs/fscache.c > > > +++ b/fs/nfs/fscache.c > > > @@ -381,15 +381,25 @@ static void > > > nfs_readpage_from_fscache_complete(struct page *page, > > > void *context, > > > int error) > > > { > > > + struct nfs_readdesc desc; > > > + struct inode *inode = page->mapping->host; > > > + > > > dfprintk(FSCACHE, > > > "NFS: readpage_from_fscache_complete > > > (0x%p/0x%p/%d)\n", > > > page, context, error); > > > > > > - /* if the read completes with an error, we just unlock > > > the > > > page and let > > > - * the VM reissue the readpage */ > > > if (!error) { > > > SetPageUptodate(page); > > > unlock_page(page); > > > + } else { > > > + desc.ctx = context; > > > + nfs_pageio_init_read(&desc.pgio, inode, false, > > > + > > > &nfs_async_read_completion_ops); > > > + error = readpage_async_filler(&desc, page); > > > + if (error) > > > + return; > > > > This code path can clearly fail too. Why can we not fix this code > > to > > allow it to return that reported error so that we can handle the > > failure case in nfs_readpage() instead of dead-ending here? > > > > Maybe the below patch is what you had in mind? That way if fscache > is enabled, nfs_readpage() should behave the same way as if it's not, > for the case where an IO error occurs in the NFS read completion > path. > > If we call into fscache and we get back that the IO has been > submitted, > wait until it is completed, so we'll catch any IO errors in the read > completion > path. This does not solve the "catch the internal errors", IOW, the > ones that show up as pg_error, that will probably require copying > pg_error into nfs_open_context.error field. > > diff --git a/fs/nfs/read.c b/fs/nfs/read.c > index 78b9181e94ba..28e3318080e0 100644 > --- a/fs/nfs/read.c > +++ b/fs/nfs/read.c > @@ -357,13 +357,13 @@ int nfs_readpage(struct file *file, struct page > *page) > } else > desc.ctx = > get_nfs_open_context(nfs_file_open_context(file)); > > + xchg(&desc.ctx->error, 0); > if (!IS_SYNC(inode)) { > ret = nfs_readpage_from_fscache(desc.ctx, inode, > page); > if (ret == 0) > - goto out; > + goto out_wait; > } > > - xchg(&desc.ctx->error, 0); > nfs_pageio_init_read(&desc.pgio, inode, false, > &nfs_async_read_completion_ops); > > @@ -373,6 +373,7 @@ int nfs_readpage(struct file *file, struct page > *page) > > nfs_pageio_complete_read(&desc.pgio); > ret = desc.pgio.pg_error < 0 ? desc.pgio.pg_error : 0; > +out_wait: > if (!ret) { > ret = wait_on_page_locked_killable(page); > if (!PageUptodate(page) && !ret) > > > > > > > + > > > + nfs_pageio_complete_read(&desc.pgio); > > > } > > > } > > > > > > > -- > > Trond Myklebust > > Linux NFS client maintainer, Hammerspace > > trond.myklebust@xxxxxxxxxxxxxxx > > > > > Yes, please. This avoids that duplication of NFS read code in the fscache layer. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx