On Thu, Nov 21, 2024 at 10:11:33AM -0500, Benjamin Coddington wrote: > On 20 Nov 2024, at 10:22, Chuck Lever wrote: > > > On Wed, Nov 20, 2024 at 09:09:35AM -0500, Benjamin Coddington wrote: > >> If we're unable to register a SCSI device, ensure we mark the device as > >> unavailable so that it will timeout and be re-added via GETDEVINFO. This > >> avoids repeated doomed attempts to register a device in the IO path. > >> > >> Add some clarifying comments as well. > >> > >> Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") > >> Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx> > >> Reviewed-by: Christoph Hellwig <hch@xxxxxx> > >> --- > >> fs/nfs/blocklayout/blocklayout.c | 12 +++++++++++- > >> 1 file changed, 11 insertions(+), 1 deletion(-) > >> > >> diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c > >> index 0becdec12970..b36bc2f4f7e2 100644 > >> --- a/fs/nfs/blocklayout/blocklayout.c > >> +++ b/fs/nfs/blocklayout/blocklayout.c > >> @@ -571,19 +571,29 @@ bl_find_get_deviceid(struct nfs_server *server, > >> if (!node) > >> return ERR_PTR(-ENODEV); > >> > >> + /* > >> + * Devices that are marked unavailable are left in the cache with a > >> + * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or > >> + * constantly attempting to register the device. Once marked as > >> + * unavailable they must be deleted and never reused. > >> + */ > >> if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) { > >> unsigned long end = jiffies; > >> unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT; > >> > >> if (!time_in_range(node->timestamp_unavailable, start, end)) { > >> + /* Force a new GETDEVINFO for this LAYOUT */ > > > > Or perhaps: "Uncork subsequent GETDEVINFO operations for this device" > > <shrug> > > Sure, ok! > > >> nfs4_delete_deviceid(node->ld, node->nfs_client, id); > >> goto retry; > >> } > >> goto out_put; > >> } > >> > >> - if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) > >> + /* If we cannot register, treat this device as transient */ > > > > How about "Make a negative cache entry for this device" > > Hmm - that's closer to the dentry language rather than how we refer to > temporary error cases in device land. For me the "transient" has some > hopeful meaning as in we expect this might work in the future - but I'm ok > changing this comment. There will be some NFS clients that might try to do > pNFS SCSI but will never actually have the devices locally, and so that's > not a "transient" situation. This can only fixed today with export policy. No big deal! > >> + if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) { > >> + nfs4_mark_deviceid_unavailable(node); > >> goto out_put; > >> + } > >> > >> return node; > >> > >> -- > >> 2.47.0 > >> > > > > It took me a bit to understand what this patch does. It is like > > setting up a negative dentry so the local device cache absorbs > > bursts of checks for the device. OK. > > Yes, its like the layout error handling, but for devices. > > Its not obvious at this layer, but every IO wants to do LAYOUTGET, then > figure out which device GETDEVINFO, then here we need to prep the device > with a reservation. Its a lot of slow work that makes a mess of IO > latencies if one of the later steps is going to fail for awhile. Thanks! It would be nice to add this bit of context to the patch description. > > Just an observation: Negative caching has some consequences too. > > For instance, there will now be a period where, if the device > > happens to become available, the layout is still unusable. I wonder > > if that's going to have some undesirable operational effects. > > It sure does, but I don't think there's a simple way to get notified that a > SCSI device has re-appeared or has started supporting persistent > reservations. Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx> -- Chuck Lever