Re: [PATCH v1] svcauth_gss: Close connection when dropping an incoming message

On Mon, Sep 12, 2016 at 11:57:13AM -0400, Chuck Lever wrote:
> Hi Bruce-
> 
> 
> > On Sep 9, 2016, at 5:18 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > 
> > On Wed, Sep 07, 2016 at 04:36:19PM -0400, Chuck Lever wrote:
> >> S5.3.3.1 of RFC 2203 requires that an incoming GSS-wrapped message
> >> whose sequence number lies outside the current window is dropped.
> >> The rationale is:
> >> 
> >>  The reason for discarding requests silently is that the server
> >>  is unable to determine if the duplicate or out of range request
> >>  was due to a sequencing problem in the client, network, or the
> >>  operating system, or due to some quirk in routing, or a replay
> >>  attack by an intruder.  Discarding the request allows the client
> >>  to recover after timing out, if indeed the duplication was
> >>  unintentional or well intended.
> >> 
> >> However, clients may rely on the server dropping the connection to
> >> indicate that a retransmit is needed. Without a connection reset, a
> >> client can wait forever without retransmitting, and the workload
> >> just stops dead. I've reproduced this behavior by running xfstests
> >> generic/323 on an NFSv4.0 mount with proto=rdma and sec=krb5i.
> >> 
> >> To address this issue, have the server close the connection when it
> >> silently discards an incoming message due to a GSS sequence number
> >> problem.
> >> 
> >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >> Cc: Benjamin Coddington <bcodding@xxxxxxxxxx>
> >> ---
> >> Hi-
> >> 
> >> Passed testing with my reproducer: 10 runs of generic/323 with
> >> proto=rdma and sec=krb5i, with NFSv3, NFSv4.0, and NFSv4.1.
> >> generic/323 is 120 seconds or so of a heavy aio workload.
> >> 
> >> I tested with that dprintk replaced with pr_warn to confirm that the
> >> reproducer hits this path one or more times per test run.
> > 
> > Thanks, this is useful, but before applying I'd just like to audit other
> > uses of SVC_DROP in the server rpc code as this probably isn't the only
> > place with this problem.
> 
> Consider this a test result, then.
> 
> So, "I'd just like to audit" means you are doing the auditing now, or
> would you like me to dig into that?

I haven't looked at it yet; if you can, that would be fantastic.

> > Also, this changes behavior for v2/v3 too, does that cause any problems?
> > Is it OK for the server to just always close connections on dropping in
> > the v2/v3 case too?
> 
> I've run the same tests with NFSv3 (NFS/RDMA + krb5i or krb5p) and did
> not see a negative impact. Not much, but there it is.
> 
> What would provide more confidence that NFSv2/3 is not impacted?

I guess I'm not too worried.  Surely NFSv3 clients have always had to
handle reconnecting connections closed by the server.

--b.
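
For readers following the thread, the sequence-window rule from RFC 2203 that the patch description refers to can be sketched roughly as below. This is a minimal userspace illustration, not the kernel's svcauth_gss code: the names `seq_window`, `check_seq`, and `SEQ_DROP_AND_CLOSE` are hypothetical, and the window size of 128 is an assumption. The verdict name reflects the behavior proposed in the patch, where a dropped message also causes the connection to be closed so that the client retransmits.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SEQ_WIN 128u   /* assumed window size, for illustration only */

/* Hypothetical per-context state: the highest sequence number seen and
 * one "seen" flag per slot for the numbers in (max - SEQ_WIN, max]. */
struct seq_window {
    uint32_t max;
    bool seen[SEQ_WIN];
};

enum seq_verdict {
    SEQ_ACCEPT,          /* process the request normally */
    SEQ_DROP_AND_CLOSE   /* discard silently; with the patch, also close */
};

static enum seq_verdict check_seq(struct seq_window *w, uint32_t seq)
{
    if (seq > w->max) {
        /* The window slides forward to end at the new sequence number. */
        if (seq - w->max >= SEQ_WIN) {
            /* Jumped past the whole window: nothing old stays valid. */
            memset(w->seen, 0, sizeof(w->seen));
        } else {
            /* Clear the slots being reused by the skipped numbers. */
            for (uint32_t n = w->max + 1; n < seq; n++)
                w->seen[n % SEQ_WIN] = false;
        }
        w->max = seq;
        w->seen[seq % SEQ_WIN] = true;
        return SEQ_ACCEPT;
    }

    if (w->max - seq >= SEQ_WIN)
        return SEQ_DROP_AND_CLOSE;   /* below the current window */

    if (w->seen[seq % SEQ_WIN])
        return SEQ_DROP_AND_CLOSE;   /* duplicate within the window */

    w->seen[seq % SEQ_WIN] = true;
    return SEQ_ACCEPT;
}
```

In this sketch the caller cannot tell a replay from an innocent retransmission, which matches the RFC's rationale for dropping silently. What the caller does on `SEQ_DROP_AND_CLOSE` is the subject of the thread.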