On Thu, Jan 19, 2023 at 1:24 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > On Jan 19, 2023, at 12:50, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote: > > > > If the call to GETDEVICEINFO fails, the client fallback to doing IO > > to MDS but on every new IO call, the client tries to get the device > > again. Instead, mark the layout as unavailable as well. This way > > the client will re-try after a timeout period. > > > > Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx> > > --- > > fs/nfs/filelayout/filelayout.c | 1 + > > fs/nfs/pnfs.c | 7 +++++++ > > fs/nfs/pnfs.h | 2 ++ > > 3 files changed, 10 insertions(+) > > > > diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c > > index 4974cd18ca46..13df85457cf5 100644 > > --- a/fs/nfs/filelayout/filelayout.c > > +++ b/fs/nfs/filelayout/filelayout.c > > @@ -862,6 +862,7 @@ fl_pnfs_update_layout(struct inode *ino, > > > > status = filelayout_check_deviceid(lo, fl, gfp_flags); > > if (status) { > > + pnfs_mark_layout_unavailable(lo, iomode); > > pnfs_put_lseg(lseg); > > lseg = NULL; > > } > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c > > index a5db5158c634..bac15dcf99bb 100644 > > --- a/fs/nfs/pnfs.c > > +++ b/fs/nfs/pnfs.c > > @@ -491,6 +491,13 @@ pnfs_layout_set_fail_bit(struct pnfs_layout_hdr *lo, int fail_bit) > > refcount_inc(&lo->plh_refcount); > > } > > > > +void > > +pnfs_mark_layout_unavailable(struct pnfs_layout_hdr *lo, enum pnfs_iomode fail_bit) > > +{ > > + pnfs_layout_set_fail_bit(lo, pnfs_iomode_to_fail_bit(fail_bit)); > > I suggest rather using pnfs_layout_io_set_failed() so that we also evict the layout segment that references this unrecognised deviceid. In fact, there is already an exported function pnfs_set_lo_fail() (which could definitely do with a better name!) that does this. I'm not opposed to this approach. In the proposed patch, I treated it as the layout being still valid but unavailable. My question is: I think we need to return the layout before doing this, correct? Should I be making changes to export something like pnfs_set_plh_return_info() or would adding a new function to pnfs.c that does pnfs_set_lo_fail and returns the layout? something like +void pnfs_invalidate_return_layout(struct pnfs_layout_segment *lseg) +{ + pnfs_layout_io_set_failed(lseg->pls_layout, lseg->pls_range.iomode); + pnfs_set_plh_return_info(lseg->pls_layout, lseg->pls_range.iomode, 0); +} +EXPORT_SYMBOL_GPL(pnfs_invalidate_return_layout); status = filelayout_check_deviceid(lo, fl, gfp_flags); if (status) { + pnfs_invalidate_return_layout(lseg); pnfs_put_lseg(lseg); lseg = NULL; > > > +} > > +EXPORT_SYMBOL_GPL(pnfs_mark_layout_unavailable); > > + > > static void > > pnfs_layout_clear_fail_bit(struct pnfs_layout_hdr *lo, int fail_bit) > > { > > diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h > > index e3e6a41f19de..9f47bd883fc3 100644 > > --- a/fs/nfs/pnfs.h > > +++ b/fs/nfs/pnfs.h > > @@ -343,6 +343,8 @@ void pnfs_error_mark_layout_for_return(struct inode *inode, > > void pnfs_layout_return_unused_byclid(struct nfs_client *clp, > > enum pnfs_iomode iomode); > > > > +void pnfs_mark_layout_unavailable(struct pnfs_layout_hdr *lo, > > + enum pnfs_iomode iomode); > > /* nfs4_deviceid_flags */ > > enum { > > NFS_DEVICEID_INVALID = 0, /* set when MDS clientid recalled */ > > -- > > 2.31.1 > > > > _________________________________ > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@xxxxxxxxxxxxxxx >