On Wed, 2023-06-14 at 15:58 -0400, Olga Kornievskaia wrote:
> Hi Trond,
>
> I'm looking for advice on how to handle the problem that when
> BADSESSION is received (on an interrupted slot), we don't increment
> the seqid for that slot. The client releases the slot and it's
> possible for another thread to use it before the session is frozen.
> Here are the (unfiltered sequential) tracepoints showing the problem.
> Follow slot_nr=0 and seq_nr=7673
>
> kworker/u2:26-541 [000] ..... 869.508658: nfs4_sequence_done:
>     error=-10052 (BADSESSION) session=0x90caa481 slot_nr=4 seq_nr=4259
>     highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> kworker/u2:26-541 [000] ..... 869.508661: nfs4_write:
>     error=-10052 (BADSESSION) fileid=00:3b:111 fhandle=0x59c8ccff
>     offset=2304664 count=7992 res=0 stateid=1:0x3f4f04cd
>     layoutstateid=0:0x00000000
> kworker/u2:1-3198 [000] ..... 869.508898: nfs4_xdr_status:
>     task:0000a2ae@00000011 xid=0x5d0f6dda error=-10052 (BADSESSION)
>     operation=53
> kworker/u2:1-3198 [000] ..... 869.508905: nfs4_sequence_done:
>     error=-10052 (BADSESSION) session=0x90caa481 slot_nr=0 seq_nr=7673
>     highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> dt-3684 [000] ..... 869.508918: nfs4_set_lock:
>     error=-10052 (BADSESSION) cmd=SETLK:WRLCK range=1603340:1834535
>     fileid=00:3b:109 fhandle=0x7c6bc6b4 stateid=1:0x8f5f1fe4
>     lockstateid=0:0x7bd5c66f
>
> *** this is the use of slot_nr=0 seq_nr=7673 that gets BADSESSION.
> Slot gets released without incrementing the seq#. The next tracepoint
> shows the use of the slot again by another lock call ***
>
> kworker/u2:1-3198 [000] ..... 869.508928: nfs4_setup_sequence:
>     session=0x90caa481 slot_nr=0 seq_nr=7673 highest_used_slotid=1
> kworker/u2:29-549 [000] ..... 869.509746: nfs4_sequence_done:
>     error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7673
>     highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> dt-3672 [000] ..... 869.509770: nfs4_set_lock:
>     error=0 (OK) cmd=SETLK:WRLCK range=146432:159743 fileid=00:3b:129
>     fhandle=0x50fa2dd4 stateid=1:0xcf065b31 lockstateid=1:0x5c571804
> kworker/u2:26-541 [000] ..... 869.509814: nfs4_setup_sequence:
>     session=0x90caa481 slot_nr=0 seq_nr=7674 highest_used_slotid=0
> kworker/u2:26-541 [000] ..... 869.509857: nfs4_setup_sequence:
>     session=0x90caa481 slot_nr=1 seq_nr=7805 highest_used_slotid=1
>
> ** finally the state manager gets to run? But only after 3 "NEW" uses
> of slots are done **
>
> 172.28.68.180-m-3751 [000] ..... 869.510267: nfs4_state_mgr:
>     hostname=172.28.68.180 clp state=MANAGER_RUNNING|CHECK_LEASE|0xc040
> kworker/u2:29-549 [000] ..... 869.510977: nfs4_xdr_status:
>     task:0000a2c8@00000011 xid=0x5e0f6dda error=-10052 (BADSESSION)
>     operation=53
> kworker/u2:29-549 [000] ..... 869.510983: nfs4_sequence_done:
>     error=-10052 (BADSESSION) session=0x90caa481 slot_nr=1 seq_nr=7805
>     highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> kworker/u2:29-549 [000] ..... 869.510985: nfs4_write:
>     error=-10052 (BADSESSION) fileid=00:3b:129 fhandle=0x50fa2dd4
>     offset=146432 count=13312 res=0 stateid=1:0xcf065b31
>     layoutstateid=0:0x00000000
> kworker/u2:26-541 [000] ..... 869.511318: nfs4_sequence_done:
>     error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7674
>     highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> dt-3669 [000] ..... 869.511337: nfs4_set_lock:
>     error=0 (OK) cmd=SETLK:WRLCK range=2462720:2469375 fileid=00:3b:138
>     fhandle=0xe30d8cf3 stateid=1:0xe2787aa1 lockstateid=1:0x216421fe
> 172.28.68.180-m-3751 [000] ..... 869.511918: nfs4_destroy_session:
>     error=0 (OK) dstaddr=172.28.68.180
> 172.28.68.180-m-3751 [000] ..... 869.513347: nfs4_create_session:
>     error=0 (OK) dstaddr=172.28.68.180
>
> To prevent reuse of the same slot/seqid when we receive BADSESSION,
> can we perhaps set slot->seq_done? Then, when
> nfs41_sequence_process() calls nfs41_sequence_free_slot(), it'd
> increment seq_nr. Slot re-use would be prevented.
>
> Or, perhaps we set the NFS4_SLOT_TBL_DRAINING bit right in
> nfs41_sequence_process() for BADSESSION so that nothing else can get
> the slot when it's released?
>
> Or some other way of preventing slots from being (re)used after
> receiving BADSESSION on that slot. The problem with re-using
> (interrupted) slots is that they get a cached reply from the server,
> and those operations "think" the operation succeeded while holding
> wrong/invalid stateids, for instance.
>
> Here's the sequence of events. First of all, this is a session
> trunking scenario where one of the servers leaves the group.
> NFS OP uses slot=0 seq=0 and sends it to server 1. Server 1 processes
> the request and populates its session cache. But the reply never
> reaches the client. Connection gets reset.
> NFS OP is resent using slot=0 seq=0 to server 2, which just left the
> trunking group. It replies with BADSESSION.
> (session is not frozen on the client yet) new NFS OP uses slot=0
> seq=0 and sends it to server 1. Server 1 responds out of the session
> cache.
> Client destroys the session.
> Client uses stateid returned from the new OP, which is really invalid
> for the operation. Server fails the operation. Application failure
> occurs.
>
> Thank you.

I suggest just adding a call along the lines of

        set_bit(NFS4_SLOT_TBL_DRAINING, &session->fc_slot_table.slot_tbl_state);

immediately before the call to nfs4_schedule_session_recovery() in
nfs41_sequence_process().

That ought to be race-free because we should still be holding the
slot. It won't try to do any of the other fancy stuff in
nfs4_drain_slot_tbl(). All that will happen is that
nfs4_setup_sequence() will stop allocating new unprivileged slots.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
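
For readers following the thread, the race and the suggested fix can be
modelled in user space. This is only a rough sketch under stated
assumptions: every toy_* name is hypothetical, a plain bool stands in for
the NFS4_SLOT_TBL_DRAINING bit, and none of it is the kernel's actual code
or locking. It only shows why marking the table as draining while the slot
is still held keeps a second unprivileged request from picking up the same
slot/seqid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy user-space model; these are NOT the kernel's structures. */
struct toy_slot {
    uint32_t seq_nr;  /* sequence id sent with the next request */
    bool in_use;      /* true while a request holds the slot */
};

struct toy_slot_table {
    struct toy_slot slot;  /* one slot is enough to show the race */
    bool draining;         /* stands in for NFS4_SLOT_TBL_DRAINING */
};

/* Hand out the slot unless it is busy, or the table is draining and
 * the caller is unprivileged. Returns true and sets *seq_out on success. */
static bool toy_setup_sequence(struct toy_slot_table *tbl, bool privileged,
                               uint32_t *seq_out)
{
    if (tbl->draining && !privileged)
        return false;
    if (tbl->slot.in_use)
        return false;
    tbl->slot.in_use = true;
    *seq_out = tbl->slot.seq_nr;
    return true;
}

/* Process a reply and release the slot. On BADSESSION the seq_nr is
 * left alone, as in the trace; set_draining selects the suggested fix. */
static void toy_sequence_done(struct toy_slot_table *tbl, bool badsession,
                              bool set_draining)
{
    assert(tbl->slot.in_use);  /* the slot is still held at this point */
    if (badsession) {
        if (set_draining)
            tbl->draining = true;  /* set before the slot is released */
    } else {
        tbl->slot.seq_nr++;
    }
    tbl->slot.in_use = false;
}

/* Returns true if a second unprivileged request went out with the same
 * seq_nr that had just received BADSESSION. */
static bool toy_seq_reused_after_badsession(bool set_draining)
{
    struct toy_slot_table tbl = {
        .slot = { .seq_nr = 7673, .in_use = false },
        .draining = false,
    };
    uint32_t first = 0, second = 0;

    if (!toy_setup_sequence(&tbl, false, &first))
        return false;
    toy_sequence_done(&tbl, true, set_draining);

    if (!toy_setup_sequence(&tbl, false, &second))
        return false;  /* blocked: the table is draining */
    return second == first;
}
```

Calling toy_seq_reused_after_badsession(false) models the behaviour in the
trace, where the slot is handed out again with the unchanged seq_nr;
passing true models the suggested set_bit() placement, where the second
unprivileged request is refused until recovery has run.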