Re: [PATCH 1/1] [SUNRPC] make sure to clone timeout values

On Tue, May 22, 2018 at 3:56 PM, Trond Myklebust
<trondmy@xxxxxxxxxxxxxxx> wrote:
> On Tue, 2018-05-22 at 15:28 -0400, Olga Kornievskaia wrote:
>> On Tue, May 22, 2018 at 3:03 PM, Trond Myklebust
>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>> > On Tue, 2018-05-22 at 14:40 -0400, Olga Kornievskaia wrote:
>> > > From: Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx>
>> > >
>> > > For pNFS, the operations to DS currently timeout in 10s. According
>> > > to the spec, the client must not be re-trying an NFSv4.1 operation
>> > > unless the connection was broken.
>> > >
>> > > Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx>
>> > > ---
>> > >  net/sunrpc/clnt.c | 1 +
>> > >  1 file changed, 1 insertion(+)
>> > >
>> > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>> > > index 6e432ec..97517eb 100644
>> > > --- a/net/sunrpc/clnt.c
>> > > +++ b/net/sunrpc/clnt.c
>> > > @@ -668,6 +668,7 @@ struct rpc_clnt *
>> > >               .prognumber     = clnt->cl_prog,
>> > >               .version        = clnt->cl_vers,
>> > >               .authflavor     = flavor,
>> > > +             .timeout        = clnt->cl_timeout,
>> > >       };
>> > >       return __rpc_clone_client(&args, clnt);
>> > >  }
>> >
>> > What does this patch have to do with pNFS? That's the generic RPC
>> > client cloning API you are changing.
>> >
>> > The pNFS/files timeouts are intended to be set using the
>> > dataserver_retrans and dataserver_timeo module parameters described
>> > at the bottom of fs/nfs/filelayout/filelayoutdev.c
>>
>> Ok so perhaps the code needs to be re-written so that it allows for
>> the DS to get an rpc client with its timeouts set. Which currently
>> doesn't happen.
>>
>> From what I could tell the DS code tries to set the timeout values in
>> nfs4_set_ds_client() but that has no effect.
>>
>> nfs4_find_or_create_ds_client() calls rpc_clone_client_set_auth()
>> which creates an rpc client but the timeouts that were set are ignored
>> and instead the rpc client is getting created with this 10s timeout.
>>
>> (but I thought that in general it made sense that a clone also copies
>> the timeout values)
>>
>
> It does not make sense when you consider that the timeout is a
> per-transport attribute.
>
> FWIW, I've no idea where this 10s timeout you are seeing is coming
> from. Perhaps it is worthwhile figuring that out first?

Besides the 10s value itself (whose origin I have also been having a
really hard time figuring out), the problem is that it acts as the max
timeout: once the 10s are up, the client gives up and fails the
operation, which is then re-tried against the MDS. This shouldn't
happen. So I felt that even if that value were 60s, the operation
shouldn't time out after 60s and be re-tried (which is what happens
without the fix that I'm proposing).

I'll give it a bit more time to figure out where the 10s is coming from.