Re: unsharing tcp connections from different NFS mounts

Tom Talpey <tom@xxxxxxxxxx> · Wed, 7 Oct 2020 10:08:15 -0400

On 10/6/2020 5:26 PM, Igor Ostrovsky wrote:

On Tue, Oct 6, 2020 at 12:30 PM Bruce Fields <bfields@xxxxxxxxxxxx> wrote:

    On Tue, Oct 06, 2020 at 01:07:11PM -0400, Tom Talpey wrote:
     > On 10/6/2020 11:22 AM, Bruce Fields wrote:
     > >On Tue, Oct 06, 2020 at 11:20:41AM -0400, Chuck Lever wrote:
     > >>
     > >>
     > >>>On Oct 6, 2020, at 11:13 AM, bfields@xxxxxxxxxxxx wrote:
     > >>>
     > >>>NFSv4.1+ differs from earlier versions in that it always performs
     > >>>trunking discovery that results in mounts to the same server
     > >>>sharing a TCP connection.
     > >>>
     > >>>It turns out this results in performance regressions for some
     > >>>users; apparently the workload on one mount interferes with
     > >>>performance of another mount, and they were previously able to
     > >>>work around the problem by using different server IP addresses
     > >>>for the different mounts.
     > >>>
     > >>>Am I overlooking some hack that would reenable the previous
     > >>>behavior? Or would people be averse to an "-o noshareconn" option?
     > >>
     > >>I thought this was what the nconnect mount option was for.
     > >
     > >I've suggested that.  It doesn't isolate the two mounts from each
     > >other in the same way, but I can imagine it might make it less
     > >likely that a user on one mount will block a user on another?  I
     > >don't know, it might depend on the details of their workload and a
     > >certain amount of luck.
     >
     > Wouldn't it be better to fully understand the reason for the
     > performance difference, before changing the mount API? If it's
     > a guess, it'll come back to haunt the code for years.
     >
     > For example, maybe it's lock contention in the xprt transport code,
     > or in the socket stack.

    Yeah, I wonder too, and I don't have the details.

I've seen cases like this:

     dd if=/dev/zero of=/mnt/mount1/zeros &
     ls /mnt/mount2/

If /mnt/mount1 and /mnt/mount2 are NFS v3 mounts to the same server IP, 
the access to /mnt/mount2 can take a long time because the RPCs from "ls 
/mnt/mount2/" get stuck behind a bunch of the writes to /mnt/mount1. If 
/mnt/mount1 and /mnt/mount2 use different IPs on the same server, the
accesses to /mnt/mount2 aren't impacted by the write workload on
/mnt/mount1 (unless the server side is saturated, obviously).
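
To make the queueing effect concrete, here is a minimal Python sketch of
a single FIFO send queue. It is a toy model, not the Linux RPC client:
the per-request costs (1000 us per WRITE, 10 us per READDIR), the 64
queued writes, and the helper name drain() are all made-up assumptions.

```python
def drain(queue):
    """Serve (label, cost_us) requests strictly in order.

    Returns the time at which the last request of each label completes.
    """
    clock_us = 0
    finished = {}
    for label, cost_us in queue:
        clock_us += cost_us
        finished[label] = clock_us
    return finished


# Hypothetical costs, chosen only to make the effect visible.
WRITE_COST_US = 1000
READDIR_COST_US = 10

writes = [("write", WRITE_COST_US)] * 64    # backlog from the dd on mount1
readdir = [("readdir", READDIR_COST_US)]    # the ls on mount2

# Shared connection: the READDIR waits behind every queued WRITE.
shared = drain(writes + readdir)

# Separate connection: the READDIR has a queue of its own.
separate = drain(readdir)

print(shared["readdir"], separate["readdir"])
```

In this model the READDIR's latency on the shared queue grows linearly
with the write backlog, while on its own queue it stays constant.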

This is plausible, and if so, I believe it indicates a credit/slot
shortage.

Does the client request more slots when it begins to share another
mount point on the connection? Does the server grant them, if so?
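
A rough Python sketch of what a slot shortage would look like, assuming
a fixed-size slot table shared by everything on the connection. The
class SlotTable, the helper readdir_gets_slot(), and the numbers (16
slots, 64 or 8 outstanding writes) are hypothetical and only illustrate
the idea; this is not how the client or server actually implements it.

```python
class SlotTable:
    """Toy fixed-size slot table: one slot per in-flight request."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.in_use = set()

    def acquire(self):
        """Return a free slot id, or None if every slot is taken."""
        for slot_id in range(self.max_slots):
            if slot_id not in self.in_use:
                self.in_use.add(slot_id)
                return slot_id
        return None

    def release(self, slot_id):
        self.in_use.discard(slot_id)


def readdir_gets_slot(max_slots, outstanding_writes):
    """True if a READDIR finds a free slot while the writes are in flight."""
    table = SlotTable(max_slots)
    for _ in range(outstanding_writes):
        table.acquire()  # writes that find no slot would queue; ignored here
    return table.acquire() is not None


# Writes from mount1 fill all 16 slots, so mount2's READDIR has to wait.
saturated = readdir_gets_slot(max_slots=16, outstanding_writes=64)

# With headroom in the table the READDIR is admitted immediately.
headroom = readdir_gets_slot(max_slots=16, outstanding_writes=8)

print(saturated, headroom)
```

If the table does not grow when a second mount starts sharing the
connection, the second mount only gets a slot when one of the first
mount's requests completes.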

Tom.

It sounds like with NFS v4.1 trunking discovery, using separate IPs for 
the two mounts is no longer a sufficient workaround.

Igor