Re: nfsd thread limit and UDP ?

On Thu, Feb 21, 2019 at 12:35:46PM +0000, James Pearson wrote:
> On Thu, 21 Feb 2019 at 04:18, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> >
> > On Wed, Feb 20, 2019 at 11:28:53AM +0000, James Pearson wrote:
> > > On a very busy NFSv3 server (running CentOS 6), we recently upped the
> > > nfsd thread count to 1024 - but this caused client mount requests over
> > > UDP to fail.
> > >
> > > We configure all our clients to use TCP for NFS mounts, but the
> > > automounter (automountd) on MacOS (up to version MacOS 10.12) seeds a
> > > 'null call' to the NFS server over UDP before attempting the mount -
> > > but the server appears to ignore any UDP requests - and the automount
> > > fails
> >
> > By the way, you might also just turn off UDP.  (Run rpc.nfsd with
> > the -U option.)  Hopefully MacOS can handle that case.
> 
> We tried that - but when we restarted nfs, some existing mounts hung
> (not sure why, as we should be just using TCP everywhere) ... although
> when tested on a test server, the MacOS automounter worked fine

It's probably not a good idea to turn off UDP while there are existing
mounts, even if the mounts are supposedly TCP.  At a guess, maybe one
of the sideband protocols (NLM or NSM) is using UDP and that's causing
problems.

> I tried your patch - it doesn't apply 'as is' on a CentOS 6 kernel -
> but with a bit of manual hacking, I can get it to fit

Whoops, I missed at first that you were on an older kernel.

> However, the net/sunrpc/svcsock.c in these kernels has an extra call
> to svc_sock_setbufsize() :
> 
>         /* Initialize the socket */
>         if (sock->type == SOCK_DGRAM)
>                 svc_udp_init(svsk, serv);
>         else {
>                 /* initialise setting must have enough space to
>                  * receive and respond to one request.
>                  */
>                 svc_sock_setbufsize(svsk->sk_sock, 4 * serv->sv_max_mesg,
>                                         4 * serv->sv_max_mesg);
>                 svc_tcp_init(svsk, serv);
>         }
> 
> I tried replacing that svc_sock_setbufsize() with:
> 
>                 svc_sock_setbufsize(svsk, 4);
> 
> but that just caused the whole machine to lock up shortly after
> sunrpc.ko was loaded ...

Looks like it's trying to dereference svsk->xpt_server before
svc_tcp_init() has initialized it.
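
A minimal user-space sketch of that ordering problem, not kernel code.
The model_* names are hypothetical stand-ins for svc_sock, svc_serv and
the init path, and the assumption is that the server pointer is still
NULL until the init function runs:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's svc_serv and svc_sock. */
struct model_serv { unsigned int max_mesg; };
struct model_sock { struct model_serv *server; int sndbuf; };

/*
 * Models a setbufsize that looks up max_mesg through the socket's
 * server pointer.  The kernel version has no NULL check, so calling it
 * before init would dereference an unset pointer; this model reports
 * -1 instead so the ordering problem is visible without crashing.
 */
static int model_setbufsize(struct model_sock *s, unsigned int nreqs)
{
    if (s->server == NULL)
        return -1;
    s->sndbuf = (int)(nreqs * s->server->max_mesg * 2);
    return 0;
}

/* Models the part of init that attaches the server to the socket. */
static void model_init(struct model_sock *s, struct model_serv *serv)
{
    s->server = serv;
}
```

Calling model_setbufsize() before model_init() fails; calling it
afterwards succeeds, which is why a variant taking the sizes as
explicit arguments is safe at that point in the code.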

> However, things seem to work fine if I call a copy of the original
> svc_sock_setbufsize() at that point in the code with the original args
> ...
> 
> i.e. mounts over UDP (and MacOS automounts) now work with nfsd threads
> over 1017 (I tried 2048 ... and it worked)

OK, I think that's evidence enough that this overflow was the problem
you were hitting, so I'll send that patch upstream.
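
A rough sketch of the arithmetic behind the 1017-thread boundary.  The
assumptions here are not stated in this thread: that the UDP path
requests (nrthreads + 3) * sv_max_mesg bytes, that the value is doubled
before being stored in a signed int, and that sv_max_mesg is 1 MiB plus
one 4 KiB page.  The helper names are made up for illustration:

```c
#include <limits.h>
#include <stdint.h>

/* Assumed value: 1 MiB max payload plus one 4 KiB page. */
#define ASSUMED_SV_MAX_MESG 1052672u

/*
 * Buffer size the UDP path is assumed to request, doubled as it would
 * be when stored, computed in 64 bits so it cannot wrap here.
 */
static uint64_t requested_sndbuf(unsigned int nrthreads)
{
    return (uint64_t)(nrthreads + 3) * ASSUMED_SV_MAX_MESG * 2;
}

/* Nonzero when that value no longer fits in a signed int. */
static int overflows_int(unsigned int nrthreads)
{
    return requested_sndbuf(nrthreads) > (uint64_t)INT_MAX;
}
```

Under those assumptions 1017 threads still fit in an int and 1018 do
not, which would be consistent with the threshold reported above.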

> Incidentally, I came across an old thread on this list that appears to
> be related to this issue (well, it mentions a 1020 thread limit and
> buffer size wraps in svc_sock_setbufsize() ???) :
> 
>  https://www.spinics.net/lists/linux-nfs/msg34927.html
> 
> ... but I'm not sure what the result of that was (nor if it is
> actually related to the issue here) ?

Yeah, see https://www.spinics.net/lists/linux-nfs/msg34932.html.  So, I
knew about this problem and even made a patch before and then somehow
dropped it.  I'm not sure how that happened.  Anyway, I have it queued
up for 5.1 now, so that shouldn't happen again.

--b.


