> On Jun 20, 2024, at 10:15 AM, Christoph Hellwig <hch@xxxxxx> wrote: > > On Thu, Jun 20, 2024 at 09:52:59AM -0400, Benjamin Coddington wrote: >> On 20 Jun 2024, at 1:06, Christoph Hellwig wrote: >> >>> On Wed, Jun 19, 2024 at 01:39:33PM -0400, cel@xxxxxxxxxx wrote: >>>> - if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags) == 0) >>>> + if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags) == 0) { >>> >>> It might be worth to invert this and keep the unavailable handling in >>> the branch as that's the exceptional case. That code is also woefully >>> under-documented and could have really used a comment. >> >> The transient device handling in general, or just this bit of it? > > Basically the code behind this NFS_DEVICEID_UNAVAILABLE check here. > >>>> + if (d->pr_reg) >>>> + if (d->pr_reg(d) < 0) >>>> + goto out_put; >>> >>> Empty line after variable declarations. Also is there anything that >>> synchronizes the lookups here so that we don't do multiple registrations >>> in parallel? >> >> I don't think there is. Do we get an error if we register twice? > > Yes. That's the basically the same condition as the one that made > Chuck create this series. No, the problem I saw was that the device was unregistered early, and that resulted in I/O errors and falling back to the MDS. pr_register() doesn't fail if the device is already registered (at least when the key is the same). -- Chuck Lever