Re: Handling of BADSESSION error

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Wed, 14 Jun 2023 21:43:49 +0000

On Wed, 2023-06-14 at 15:58 -0400, Olga Kornievskaia wrote:
> Hi Trond,
> 
> I'm looking for advice on how to handle the case where BADSESSION is
> received (on an interrupted slot) and we don't increment the seqid
> for that slot. The client releases the slot and it's possible for
> another thread to use it before the session is frozen. Here are the
> (unfiltered sequential) tracepoints showing the problem. Follow
> slot_nr=0 and seq_nr=7673.
> 
>    kworker/u2:26-541     [000] .....   869.508658:
> nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=4 seq_nr=4259
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>    kworker/u2:26-541     [000] .....   869.508661: nfs4_write:
> error=-10052 (BADSESSION) fileid=00:3b:111 fhandle=0x59c8ccff
> offset=2304664 count=7992 res=0 stateid=1:0x3f4f04cd
> layoutstateid=0:0x00000000
>     kworker/u2:1-3198    [000] .....   869.508898: nfs4_xdr_status:
> task:0000a2ae@00000011 xid=0x5d0f6dda error=-10052 (BADSESSION)
> operation=53
>     kworker/u2:1-3198    [000] .....   869.508905:
> nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>               dt-3684    [000] .....   869.508918: nfs4_set_lock:
> error=-10052 (BADSESSION) cmd=SETLK:WRLCK range=1603340:1834535
> fileid=00:3b:109 fhandle=0x7c6bc6b4 stateid=1:0x8f5f1fe4
> lockstateid=0:0x7bd5c66f
> 
> *** This is the use of slot_nr=0 seq_nr=7673 that gets BADSESSION.
> The slot gets released without incrementing the seq#. The next
> tracepoint shows the slot being used again by another lock call ***
> 
>     kworker/u2:1-3198    [000] .....   869.508928:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_used_slotid=1
>    kworker/u2:29-549     [000] .....   869.509746:
> nfs4_sequence_done:
> error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7673
> highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
>               dt-3672    [000] .....   869.509770: nfs4_set_lock:
> error=0 (OK) cmd=SETLK:WRLCK range=146432:159743 fileid=00:3b:129
> fhandle=0x50fa2dd4 stateid=1:0xcf065b31 lockstateid=1:0x5c571804
>    kworker/u2:26-541     [000] .....   869.509814:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7674
> highest_used_slotid=0
>    kworker/u2:26-541     [000] .....   869.509857:
> nfs4_setup_sequence: session=0x90caa481 slot_nr=1 seq_nr=7805
> highest_used_slotid=1
> 
> ** Finally the state manager gets to run? But only after 3 "NEW"
> uses of slots are done **
> 
>  172.28.68.180-m-3751    [000] .....   869.510267: nfs4_state_mgr:
> hostname=172.28.68.180 clp state=MANAGER_RUNNING|CHECK_LEASE|0xc040
>    kworker/u2:29-549     [000] .....   869.510977: nfs4_xdr_status:
> task:0000a2c8@00000011 xid=0x5e0f6dda error=-10052 (BADSESSION)
> operation=53
>    kworker/u2:29-549     [000] .....   869.510983:
> nfs4_sequence_done:
> error=-10052 (BADSESSION) session=0x90caa481 slot_nr=1 seq_nr=7805
> highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
>    kworker/u2:29-549     [000] .....   869.510985: nfs4_write:
> error=-10052 (BADSESSION) fileid=00:3b:129 fhandle=0x50fa2dd4
> offset=146432 count=13312 res=0 stateid=1:0xcf065b31
> layoutstateid=0:0x00000000
>    kworker/u2:26-541     [000] .....   869.511318:
> nfs4_sequence_done:
> error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7674
> highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
>               dt-3669    [000] .....   869.511337: nfs4_set_lock:
> error=0 (OK) cmd=SETLK:WRLCK range=2462720:2469375 fileid=00:3b:138
> fhandle=0xe30d8cf3 stateid=1:0xe2787aa1 lockstateid=1:0x216421fe
>  172.28.68.180-m-3751    [000] .....   869.511918:
> nfs4_destroy_session: error=0 (OK) dstaddr=172.28.68.180
>  172.28.68.180-m-3751    [000] .....   869.513347:
> nfs4_create_session: error=0 (OK) dstaddr=172.28.68.180
> 
> To prevent reuse of the same slot/seqid when we receive BADSESSION,
> can we perhaps set slot->seq_done? Then, when
> nfs41_sequence_process() calls nfs41_sequence_free_slot(), it would
> increment seq_nr and re-use of the slot/seqid pair would be
> prevented.
> 
> Or, perhaps we set the NFS4_SLOT_TBL_DRAINING bit right in
> nfs41_sequence_process() for BADSESSION so that nothing else can get
> the slot when it's released?
> 
> Or some other way of preventing slots being (re)used after receiving
> BADSESSION on that slot. The problem with re-using (interrupted)
> slots is that they get a cached reply from the server, and those
> operations "think" the operation succeeded and they have wrong/invalid
> stateids, for instance.
> 
> Here's the sequence of events. This is a session trunking scenario
> where one of the servers leaves the group.
> 1. NFS OP uses slot=0 seq=0 and is sent to server 1. Server 1
>    processes the request and populates its session cache, but the
>    reply never reaches the client. The connection gets reset.
> 2. NFS OP is resent using slot=0 seq=0 to server 2, which just left
>    the trunking group. It replies with BADSESSION.
> 3. (The session is not frozen on the client yet.) A new NFS OP uses
>    slot=0 seq=0 and is sent to server 1. Server 1 responds out of the
>    session cache.
> 4. The client destroys the session.
> 5. The client uses the stateid returned for the new OP, which is
>    really invalid for that operation. The server fails the operation
>    and an application failure occurs.
> 
> Thank you..

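For anyone following along, the replay described in the quoted sequence
can be modelled in a few lines of userspace C. This is only a toy
sketch, not server code: struct server_slot and server_sequence() are
made-up names, and it assumes a server that answers any request carrying
the slot's current seqid out of its reply cache without detecting a
false retry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome codes for the toy model. */
enum seq_result { SEQ_EXECUTED, SEQ_REPLAYED, SEQ_MISORDERED };

/* Toy per-slot server state: last executed seqid plus its cached reply. */
struct server_slot {
	uint32_t seqid;          /* seqid of the last request executed */
	uint32_t cached_stateid; /* stateid returned by that request */
};

/*
 * new_stateid is what the server would hand out if it really executed
 * the operation. A request with the current seqid is treated as a
 * retransmission and answered from the cache instead.
 */
static enum seq_result server_sequence(struct server_slot *slot,
				       uint32_t seqid, uint32_t new_stateid,
				       uint32_t *reply_stateid)
{
	assert(slot && reply_stateid);	/* sanity check arguments */

	if (seqid == slot->seqid) {
		/* Same seqid: replay the cached reply, execute nothing. */
		*reply_stateid = slot->cached_stateid;
		return SEQ_REPLAYED;
	}
	if (seqid == slot->seqid + 1) {
		/* Next seqid: execute and remember the reply. */
		slot->seqid = seqid;
		slot->cached_stateid = new_stateid;
		*reply_stateid = new_stateid;
		return SEQ_EXECUTED;
	}
	*reply_stateid = 0;
	return SEQ_MISORDERED;
}
```

If a first lock is executed with seqid 1 and a different lock is then
sent with seqid 1 again, the second caller gets the first lock's stateid
back, which is the invalid-stateid failure reported above.
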
I suggest just adding a call along the lines of

	set_bit(NFS4_SLOT_TBL_DRAINING, &session->fc_slot_table.slot_tbl_state);

immediately before the call to nfs4_schedule_session_recovery() in
nfs41_sequence_process(). That ought to be race-free because we should
still be holding the slot. Setting the bit directly won't do any of the
other fancy stuff in nfs4_drain_slot_tbl(). All that will happen is
that nfs4_setup_sequence() will stop allocating new unprivileged slots.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx