Re: Handling of BADSESSON error

Rick Macklem <rick.macklem@xxxxxxxxx> · Wed, 14 Jun 2023 13:24:05 -0700

On Wed, Jun 14, 2023 at 12:58 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>
>
> Hi Trond,
>
> I'm looking for advice on how to handle the problem that when
> BADSESSION is received (on an interrupted slot) and we don't increment
> the seqid for that slot. The client releases the slot and it's
> possible for another thread to use it before the session is frozen.
> Here are the (unfiltered sequential) tracepoints showing the problem.
> Follow slot_nr=0 and seq_nr=7673
>
>    kworker/u2:26-541     [000] .....   869.508658: nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=4 seq_nr=4259
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>    kworker/u2:26-541     [000] .....   869.508661: nfs4_write:
> error=-10052 (BADSESSION) fileid=00:3b:111 fhandle=0x59c8ccff
> offset=2304664 count=7992 res=0 stateid=1:0x3f4f04cd
> layoutstateid=0:0x00000000
>     kworker/u2:1-3198    [000] .....   869.508898: nfs4_xdr_status:
> task:0000a2ae@00000011 xid=0x5d0f6dda error=-10052 (BADSESSION)
> operation=53
>     kworker/u2:1-3198    [000] .....   869.508905: nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>               dt-3684    [000] .....   869.508918: nfs4_set_lock:
> error=-10052 (BADSESSION) cmd=SETLK:WRLCK range=1603340:1834535
> fileid=00:3b:109 fhandle=0x7c6bc6b4 stateid=1:0x8f5f1fe4
> lockstateid=0:0x7bd5c66f
>
> *** this is use of slot_nr=0 seq_nr=7673 that gets BADSESSION. Slot
> gets released without incrementing the seq#. The next tracepoint shows
> the use of the slot again by another lock call ***
>
>     kworker/u2:1-3198    [000] .....   869.508928:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_used_slotid=1
>    kworker/u2:29-549     [000] .....   869.509746: nfs4_sequence_done:
> error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
>               dt-3672    [000] .....   869.509770: nfs4_set_lock:
> error=0 (OK) cmd=SETLK:WRLCK range=146432:159743 fileid=00:3b:129
> fhandle=0x50fa2dd4 stateid=1:0xcf065b31 lockstateid=1:0x5c571804
>    kworker/u2:26-541     [000] .....   869.509814:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7674
> highest_used_slotid=0
>    kworker/u2:26-541     [000] .....   869.509857:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=1 seq_nr=7805
> highest_used_slotid=1
>
> ** finally the state manager gets to run? But only after 3 "NEW" use
> of slots are done **
>
>  172.28.68.180-m-3751    [000] .....   869.510267: nfs4_state_mgr:
> hostname=172.28.68.180 clp state=MANAGER_RUNNING|CHECK_LEASE|0xc040
>    kworker/u2:29-549     [000] .....   869.510977: nfs4_xdr_status:
> task:0000a2c8@00000011 xid=0x5e0f6dda error=-10052 (BADSESSION)
> operation=53
>    kworker/u2:29-549     [000] .....   869.510983: nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=1 seq_nr=7805
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>    kworker/u2:29-549     [000] .....   869.510985: nfs4_write:
> error=-10052 (BADSESSION) fileid=00:3b:129 fhandle=0x50fa2dd4
> offset=146432 count=13312 res=0 stateid=1:0xcf065b31
> layoutstateid=0:0x00000000
>    kworker/u2:26-541     [000] .....   869.511318: nfs4_sequence_done:
> error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7674
> highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
>               dt-3669    [000] .....   869.511337: nfs4_set_lock:
> error=0 (OK) cmd=SETLK:WRLCK range=2462720:2469375 fileid=00:3b:138
> fhandle=0xe30d8cf3 stateid=1:0xe2787aa1 lockstateid=1:0x216421fe
>  172.28.68.180-m-3751    [000] .....   869.511918:
> nfs4_destroy_session: error=0 (OK) dstaddr=172.28.68.180
>  172.28.68.180-m-3751    [000] .....   869.513347:
> nfs4_create_session: error=0 (OK) dstaddr=172.28.68.180
>
> To prevent reuse of the same slot/seqid for when we receive
> BADSESSION, can we perhaps set slot->seq_done? Then, when
> nfs41_sequence_process() calls nfs41_sequence_free_slot(), it'd
> increment seq_nr then. Slot re-use would be prevented.
>
> Or, perhaps we set the NFS4_SLOT_TBL_DRAINING bit right in
> nfs41_sequence_process() for BADSESSION so that nothing else can get
> the slot when it's released?
>
> Or some other way or preventing slots being (re)used after receiving
> BADSESSION on that slot. The problem if re-using (interrupted) slots
> is that they get cached reply from the server and those operations
> "think" operation succeeded and they have wrong/invalid stateids for
> instance.
>
> Here's the sequence of events. First of all this is a session trunking
> scenario where one of the servers leaves the group.
> NFS OP uses slot=0 seq=0 sends it to server 1. Server 1 processes the
> request populates its session cache. But the reply never reaches the
> client. Connection gets reset.
> NFS OP is resent using slot=0 seq=0 to server 2 which just left the
> trunking group. It replies with BADSESSION
> (session is not frozen on the client yet) new NFS OP uses slot=0 seq=0
> and sends it to server 1. Server 1 responds out of the session cache.
To me, this sounds like a broken NFSv4.1/4.2 server. Once a session is bad,
I do not think there should ever be a reply that was cached in that bad session.
Put another way, the server should not leave the "trunking group' (whatever that
means?) without making the session bad for all trunks. I do not think
a session should
ever work on one server and not on another one.

Having said the above, I have no opinion w.r.t. how such a case should
be handled.
(Except to tell the NFS server vendor that their server is broken.)

Just mho, rick

> Client destroys the session
> Client uses stateid returned from the new OP which is really invalid
> for the operation. Server fails the operation. Application failure
> occurs.
>
> Thank you..