Re: Handling of BADSESSON error

Olga Kornievskaia <aglo@xxxxxxxxx> · Wed, 14 Jun 2023 16:43:56 -0400



On Wed, Jun 14, 2023 at 4:43 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>
> On Wed, Jun 14, 2023 at 4:24 PM Rick Macklem <rick.macklem@xxxxxxxxx> wrote:
> >
> > On Wed, Jun 14, 2023 at 12:58 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> > >
> > >
> > > Hi Trond,
> > >
> > > I'm looking for advice on how to handle the problem that when
> > > BADSESSION is received (on an interrupted slot) and we don't increment
> > > the seqid for that slot. The client releases the slot and it's
> > > possible for another thread to use it before the session is frozen.
> > > Here are the (unfiltered sequential) tracepoints showing the problem.
> > > Follow slot_nr=0 and seq_nr=7673
> > >
> > >    kworker/u2:26-541     [000] .....   869.508658: nfs4_sequence_done:
> > > error=-10052 (BADSESSION) session=0x90caa481 slot_nr=4 seq_nr=4259
> > > highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> > >    kworker/u2:26-541     [000] .....   869.508661: nfs4_write:
> > > error=-10052 (BADSESSION) fileid=00:3b:111 fhandle=0x59c8ccff
> > > offset=2304664 count=7992 res=0 stateid=1:0x3f4f04cd
> > > layoutstateid=0:0x00000000
> > >     kworker/u2:1-3198    [000] .....   869.508898: nfs4_xdr_status:
> > > task:0000a2ae@00000011 xid=0x5d0f6dda error=-10052 (BADSESSION)
> > > operation=53
> > >     kworker/u2:1-3198    [000] .....   869.508905: nfs4_sequence_done:
> > > error=-10052 (BADSESSION) session=0x90caa481 slot_nr=0 seq_nr=7673
> > > highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> > >               dt-3684    [000] .....   869.508918: nfs4_set_lock:
> > > error=-10052 (BADSESSION) cmd=SETLK:WRLCK range=1603340:1834535
> > > fileid=00:3b:109 fhandle=0x7c6bc6b4 stateid=1:0x8f5f1fe4
> > > lockstateid=0:0x7bd5c66f
> > >
> > > *** this is use of slot_nr=0 seq_nr=7673 that gets BADSESSION. Slot
> > > gets released without incrementing the seq#. The next tracepoint shows
> > > the use of the slot again by another lock call ***
> > >
> > >     kworker/u2:1-3198    [000] .....   869.508928:
> > > nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7673
> > > highest_used_slotid=1
> > >    kworker/u2:29-549     [000] .....   869.509746: nfs4_sequence_done:
> > > error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7673
> > > highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> > >               dt-3672    [000] .....   869.509770: nfs4_set_lock:
> > > error=0 (OK) cmd=SETLK:WRLCK range=146432:159743 fileid=00:3b:129
> > > fhandle=0x50fa2dd4 stateid=1:0xcf065b31 lockstateid=1:0x5c571804
> > >    kworker/u2:26-541     [000] .....   869.509814:
> > > nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7674
> > > highest_used_slotid=0
> > >    kworker/u2:26-541     [000] .....   869.509857:
> > > nfs4_setup_sequence: session=0x90caa481 slot_nr=1 seq_nr=7805
> > > highest_used_slotid=1
> > >
> > > ** finally the state manager gets to run? But only after 3 "NEW" use
> > > of slots are done **
> > >
> > >  172.28.68.180-m-3751    [000] .....   869.510267: nfs4_state_mgr:
> > > hostname=172.28.68.180 clp state=MANAGER_RUNNING|CHECK_LEASE|0xc040
> > >    kworker/u2:29-549     [000] .....   869.510977: nfs4_xdr_status:
> > > task:0000a2c8@00000011 xid=0x5e0f6dda error=-10052 (BADSESSION)
> > > operation=53
> > >    kworker/u2:29-549     [000] .....   869.510983: nfs4_sequence_done:
> > > error=-10052 (BADSESSION) session=0x90caa481 slot_nr=1 seq_nr=7805
> > > highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> > >    kworker/u2:29-549     [000] .....   869.510985: nfs4_write:
> > > error=-10052 (BADSESSION) fileid=00:3b:129 fhandle=0x50fa2dd4
> > > offset=146432 count=13312 res=0 stateid=1:0xcf065b31
> > > layoutstateid=0:0x00000000
> > >    kworker/u2:26-541     [000] .....   869.511318: nfs4_sequence_done:
> > > error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7674
> > > highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> > >               dt-3669    [000] .....   869.511337: nfs4_set_lock:
> > > error=0 (OK) cmd=SETLK:WRLCK range=2462720:2469375 fileid=00:3b:138
> > > fhandle=0xe30d8cf3 stateid=1:0xe2787aa1 lockstateid=1:0x216421fe
> > >  172.28.68.180-m-3751    [000] .....   869.511918:
> > > nfs4_destroy_session: error=0 (OK) dstaddr=172.28.68.180
> > >  172.28.68.180-m-3751    [000] .....   869.513347:
> > > nfs4_create_session: error=0 (OK) dstaddr=172.28.68.180
> > >
> > > To prevent reuse of the same slot/seqid for when we receive
> > > BADSESSION, can we perhaps set slot->seq_done? Then, when
> > > nfs41_sequence_process() calls nfs41_sequence_free_slot(), it'd
> > > increment seq_nr then. Slot re-use would be prevented.
> > >
> > > Or, perhaps we set the NFS4_SLOT_TBL_DRAINING bit right in
> > > nfs41_sequence_process() for BADSESSION so that nothing else can get
> > > the slot when it's released?
> > >
> > > Or some other way or preventing slots being (re)used after receiving
> > > BADSESSION on that slot. The problem if re-using (interrupted) slots
> > > is that they get cached reply from the server and those operations
> > > "think" operation succeeded and they have wrong/invalid stateids for
> > > instance.
> > >
> > > Here's the sequence of events. First of all this is a session trunking
> > > scenario where one of the servers leaves the group.
> > > NFS OP uses slot=0 seq=0 sends it to server 1. Server 1 processes the
> > > request populates its session cache. But the reply never reaches the
> > > client. Connection gets reset.
> > > NFS OP is resent using slot=0 seq=0 to server 2 which just left the
> > > trunking group. It replies with BADSESSION
> > > (session is not frozen on the client yet) new NFS OP uses slot=0 seq=0
> > > and sends it to server 1. Server 1 responds out of the session cache.
> > To me, this sounds like a broken NFSv4.1/4.2 server. Once a session is bad,
> > I do not think there should ever be a reply that was cached in that bad session.
> > Put another way, the server should not leave the "trunking group' (whatever that
> > means?) without making the session bad for all trunks. I do not think
> > a session should
> > ever work on one server and not on another one.
>
> The spec allows for server to leave the group and session to be still valid.
>
> From section 2.10.13.1.4
> "If the SEQUENCE requests fail with NFS4ERR_BADSESSION, then the
> session no longer exists on any of the server network addresses for
> which the client has connections associated with that session ID. It
> is possible the session is still alive and available on other network
> addresses. "
>
> Linux server

I meant to say Linux client

> throws away the session on getting the BADSESSION but the
> problem is that it doesn't happen instantly. So some new requests can
> sneak before the session gets killed. Thus I'm advocating that slotid
> still happens on BADSESSION or I'm suggesting that we freeze the
> session table on BADSESSION which we currently don't do -- which
> allows new requests to go.
>
> >
> > Having said the above, I have no opinion w.r.t. how such a case should
> > be handled.
> > (Except to tell the NFS server vendor that their server is broken.)
> >
> > Just mho, rick
> >
> > > Client destroys the session
> > > Client uses stateid returned from the new OP which is really invalid
> > > for the operation. Server fails the operation. Application failure
> > > occurs.
> > >
> > > Thank you..