Re: [PATCH v1] NFSv4.1 provide mount option to toggle trunking discovery

On Thu, Feb 24, 2022 at 1:20 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
>
> > On Feb 24, 2022, at 12:55 PM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote:
> >
> > On Thu, Feb 24, 2022 at 10:30 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> >>
> >>> On Feb 23, 2022, at 12:40 PM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote:
> >>>
> >>> From: Olga Kornievskaia <kolga@xxxxxxxxxx>
> >>>
> >>> Introduce a new mount option -- trunkdiscovery,notrunkdiscovery -- to
> >>> toggle whether or not the client will engage in active discovery
> >>> of trunking locations.
> >>
> >> An alternative solution might be to change the client's
> >> probe to treat NFS4ERR_DELAY as "no trunking information
> >> available" and then allow operation to proceed on the
> >> known good transport.
> >
> > I'm not sure what you mean about "the known good transport".
>
> The transport on which the client sent the
> GETATTR(fs_locations).
>
> The NFS4ERR_DELAY response means the server has no other
> trunks available "at this time."

But GETATTR(fs_locations) isn't used only for trunking queries; it's
also used for filesystem location (migration). Are we redefining
what ERR_DELAY means in the context of trunking vs. migration?

> > I don't
> > think the ERR_DELAY is associated with a transport. Btw, a
> > previous patch that restricts the fs_locations query to the main
> > transport makes your statement even more confusing, as it would mean
> > there is no good transport. Or do you mean to say we should have
> > trunking discovery done asynchronously to mount by a separate kernel
> > thread so that it does not impact the mount steps?
>
> Yes, something like that.
>
> Trunking discovery that is independent of the NFS mount
> process should be the goal. In fact, trunking discovery
> really ought to be done in user space.

I agree that continuous management of trunking should be a goal, but
the initial setup is part of file system attribute discovery.
fs_locations is a file system attribute that is queried along with
other attributes when a file system is discovered. Thus I maintain that
the current treatment of trunking discovery is valid.

What is described below is a set of goals for trunking that we
have discussed before and that are important.

> - There is now a user/kernel API for managing transports
>
> - The trunking configuration on the server might change
>   during the lifetime of the mount, so periodic checking
>   is needed
>
> - Adding an extra round trip, especially one that might
>   be slowed by one or more NFS4ERR_DELAY replies, is
>   going to be a problem during a mount storm
>
> - There might be local policies that affect which network
>   paths to choose for trunking
>
> - The choice of transports might be made automatically
>   by an orchestrator
>
> - Tying this setting to a mount option is not appropriate
>   because the transports are shared among multiple NFS
>   mounts
>
>
> > I do object to treating a single ERR_DELAY during discovery as a
> > permanent error, as there are legitimate reasons for a delay in
> > looking up the information that the server can resolve in time.
> > However, I don't object to putting a time limit or a retry limit on
> > ERR_DELAY as a safeguard.
>
> In the past, some have objected to /any/ delay added to
> the NFS mount process.

I would again like to note that fs_locations is a file system
attribute, and thus I would argue it has to be treated like other file
system attributes.
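
For illustration only, the time limit or retry limit on ERR_DELAY
mentioned above might look roughly like the sketch below. It is Python
rather than kernel C, and probe_trunking(), send_getattr, max_tries, and
delay_s are hypothetical names, not anything in the client today. Only
the NFS4ERR_DELAY value (10008) comes from the protocol.

```python
import time
from typing import Callable, List, Optional, Tuple

NFS4_OK = 0
NFS4ERR_DELAY = 10008  # status code defined by the NFSv4 protocol

# Hypothetical reply shape: (status, list of trunk addresses or None).
Reply = Tuple[int, Optional[List[str]]]


def probe_trunking(send_getattr: Callable[[], Reply],
                   max_tries: int = 3,
                   delay_s: float = 0.0) -> Optional[List[str]]:
    """Retry GETATTR(fs_locations) while the server answers NFS4ERR_DELAY.

    Returns the locations on success, or None once max_tries is exhausted,
    meaning "no trunking information for now" rather than a mount failure.
    """
    for attempt in range(max_tries):
        status, locations = send_getattr()
        if status == NFS4_OK:
            return locations
        if status != NFS4ERR_DELAY:
            # Any other error is not something waiting will fix.
            raise OSError(f"fs_locations probe failed with status {status}")
        if attempt + 1 < max_tries:
            time.sleep(delay_s)  # give the server time before retrying
    return None
```

A single ERR_DELAY is retried here, so a slow but working server still
gets its trunks discovered, while a server that never answers costs at
most max_tries round trips.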

> There's no reason to hold up the mount process -- the
> client can try the trunking discovery probe again in a
> few moments while the mount proceeds, can't it?

Given that I suggested doing it asynchronously, I do consider it a
possible design, though I think it greatly increases the complexity of
the system (I'm not convinced that it's the right call).
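
As a rough sketch of the asynchronous alternative, assuming a background
worker stands in for the kernel thread or work queue: the mount returns
right away and any trunks are applied later. start_mount(), probe, and
on_trunks are hypothetical names used only for illustration.

```python
import threading
from typing import Callable, List, Optional


def start_mount(probe: Callable[[], Optional[List[str]]],
                on_trunks: Callable[[List[str]], None]) -> threading.Thread:
    """Finish the (simulated) mount at once; discover trunks in the background.

    probe stands in for the fs_locations query and on_trunks for whatever
    would add the discovered transports to the client.
    """
    def worker() -> None:
        trunks = probe()
        if trunks:  # None or an empty list means nothing to add
            on_trunks(trunks)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # The caller is not blocked on the probe; it may join() if it wants to wait.
    return thread
```

Even in this toy form, the extra moving parts are visible: the callback
runs on another thread, so anything it touches needs its own locking and
lifetime rules.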

> If that means handing the probe to a work queue or
> leaving it to user space, that seems like a more
> flexible choice.
>
>
> > Lastly, I think perhaps we can do both: have a mount option to
> > toggle discovery and also safeguard discovery from broken servers?
>
> I'd really rather not add a mount option for this
> purpose unless you know of another reason why trunking
> discovery needs to be disabled.

I don't offhand. I thought it was the simplest and most appropriate
solution, and perhaps in line with the "migration/nomigration" option,
but I must be mistaken there.
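
For comparison, a paired on/off option of the "migration/nomigration"
kind could be parsed roughly as follows. parse_toggle() is a
hypothetical helper, not the kernel's option parser, and the default is
left to the caller because this message does not say what it should be.

```python
def parse_toggle(options: str, name: str, default: bool) -> bool:
    """Resolve a "name" / "noname" pair in a comma-separated option string.

    The last occurrence wins; if neither form appears, default is returned.
    """
    value = default
    for opt in options.split(","):
        opt = opt.strip()
        if opt == name:
            value = True
        elif opt == "no" + name:
            value = False
    return value


# Example: discovery switched off explicitly for one mount.
enabled = parse_toggle("vers=4.1,notrunkdiscovery", "trunkdiscovery", default=True)
```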

> The best solution is to fix the server implementations.
> If that's not possible then the second best is to have
> the client manage the situation without needing any
> human intervention.
>
> Adding an administrative tunable is, to my mind, an
> option of the very last resort.
>
>
> --
> Chuck Lever
>
>
>


