Re: NFSv4.1/4.2 server returns same sessionid after DestroySession/CreateSession

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > On Jul 11, 2022, at 1:33 PM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> >
> >
> >
> >> On Jul 11, 2022, at 11:01 AM, Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> >>
> >>
> >>
> >>> On Jul 10, 2022, at 6:10 PM, Rick Macklem <rmacklem@xxxxxxxxxxx> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I have been trying to improve the behaviour of the FreeBSD
> >>> NFSv4.1/4.2 client when using the "intr" mount option.
> >>>
> >>> I have come up with the following scheme:
> >>> - When RPCs are interrupted, mark the session slot as potentially bad.
> >>> - When all session slots are marked potentially bad, do a
> >>> DestroySession (only op in RPC) to destroy the session.
> >>> - When the server replies NFS4ERR_BAD_SESSION,
> >>> do a CreateSession (only op in RPC) to acquire a new session and
> >>> continue on.
> >>>
> >>> When testing against a Linux 5.15 server, the CreateSession
> >>> succeeds, but returns the same sessionid as the old session.
> >>> Then all subsequent RPCs get the NFS4ERR_BAD_SESSION reply.
> >>> (The client repeatedly does CreateSession RPCs that reply NFS_OK,
> >>> but always with the same sessionid as the destroyed one.)
> >>>
> >>> Here's what I see in the packet trace:
> >>> (everything works normally until all session slots are marked
> >>> potentially bad at packet# 14216)
> >>> packet# RPC
> >>> 14216 DestroySession request for sessionid 2725cb62002ed418040...0
> >>> 14302 DestroySession reply NFS_OK
> >>> 14304 Getattr request (using above sessionid)
> >>> 14305 Getattr reply NFS4ERR_BAD_SESSION
> >>> 14306 CreateSession request
> >>> *** Now here is where I see a problem...
> >>> 14307 CreateSession reply NFS_OK with sessionid
> >>> 2725cb62002ed418040...0 (same as above)
> >>> 14308 Getattr request (using above sessionid)
> >>> 14309 Getattr reply NFS4ERR_BAD_SESSION
> >>> - and then this just repeats...
> >>> The whole packet trace can be found here, in case you are interested:
> >>> https://people.freebsd.org/~rmacklem/linux.pcap
> >>>
> >>> It seems to me that a successful CreateSession should always return
> >>> a new unique sessionid?
> >>
> >> Hi Rick, thanks for the bug report.
> >>
> >> CREATE_SESSION has a built-in reply cache to thwart replay attacks.
> >> It can legitimately return the same sessionid as a previous request.
> >> Granted, DESTROY_SESSION is supposed to wipe that reply cache...
Well, I just re-read the RFC and I don't see anything that says DestroySession
affects the CreateSession reply cache.

I had thought it did, but I was wrong. See below...

> >>
> >> I'd like to see if there's a test in pynfs that replicates or is close
> >> to the series of operations in your trace so that I can reproduce on
> >> my lab systems and watch it fail up close.
> >
> > I constructed a pynfs test that does something similar to your
> > reproducer:
> >
> > diff --git a/nfs4.1/server41tests/st_destroy_session.py b/nfs4.1/server41tests/st_destroy_session.py
> > index b8be62582366..014330e7d623 100644
> > --- a/nfs4.1/server41tests/st_destroy_session.py
> > +++ b/nfs4.1/server41tests/st_destroy_session.py
> > @@ -1,12 +1,33 @@
> > from .st_create_session import create_session
> > from xdrdef.nfs4_const import *
> > -from .environment import check, fail, create_file, open_file
> > +from .environment import check, fail, create_file, open_file, close_file
> > from xdrdef.nfs4_type import open_owner4, openflag4, createhow4, open_claim4
> > import nfs_ops
> > op = nfs_ops.NFS4ops()
> > import threading
> > import rpc.rpc as rpc
> >
> > +def testDestroyBasic(t, env):
> > + """Ensure operations outside a session fail with BADSESSION
> > +
> > + FLAGS: destroy_session all
> > + CODE: DSESS1
> > + """
> > + c = env.c1.new_client(env.testname(t))
> > + sess1 = c.create_session()
> > + sess1.compound([op.reclaim_complete(FALSE)])
> > + res = c.c.compound([op.destroy_session(sess1.sessionid)])
> > + res = create_file(sess1, env.testname(t),
> > + access=OPEN4_SHARE_ACCESS_READ)
> > + check(res, NFS4ERR_BADSESSION)
> > + sess2 = c.create_session()
> > + res = create_file(sess2, env.testname(t),
> > + access=OPEN4_SHARE_ACCESS_READ)
> > + check(res)
> > + fh = res.resarray[-1].object
> > + open_stateid = res.resarray[-2].stateid
> > + close_file(sess2, fh, stateid=open_stateid)
> > +
> > def testDestroy(t, env):
> > """
> > - create a session
> >
> > I'm not able to reproduce the problem on 5.19-rc5, but that
> > probably means there's something going on that we haven't
> > discovered yet.
Nope, your guess below was correct. It is a FreeBSD client bug.

> My guess is that your client is sending CREATE_SESSION operations
> with the same sequence ID (1) and that is hitting in NFSD's
> CREATE_SESSION reply cache. So it's treating the client's new
> requests as replays and returning an old (stale) sessionid.
Yep. For some reason, I thought that a DestroySeseesion
put the sequence# back.

On re-reading the RFC, there is nothing mentioning that.

Now that I've patched the client to increment the sequence#,
it works fine against the Linux server.

Sorry about the noise (hopefully your pynfs test is still useful).

Thanks for the help, rick
ps: Now I have to fix a deficiency in the FreeBSD server. Right now,
     it only checks the sequence# for an unconfirmed ClientID, but
     that is obviously a FreeBSD bug, too.


--
Chuck Lever








[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux