Vladimir reported a problem running a certain set of testcases. These tests spin up a number of QEMU/KVM guests that use nfsroot. After running for a while, they eventually reboot, and that boot hangs while mounting from the server because the server continually returns NFS4ERR_DELAY on a CREATE_SESSION operation.

With the addition of some new tracepoints that I'll eventually post, I think I have a handle on what's happening.

First, the server decides to issue a CB_RECALL_ANY to the client, but there is a fault on the callback channel:

    nfsd-42728 [000] ....... 1678.271093: nfsd_fh_verify: xid=0x9109236c fh_hash=0x4394a635 type=0x0 access=BYPASS_GSS
    nfsd-42728 [000] ....... 1678.271093: nfsd_compound_status: op=2/3 OP_PUTFH status=0
    nfsd-42728 [000] ....... 1678.271093: nfsd_fh_verify: xid=0x9109236c fh_hash=0x4394a635 type=LNK access=
    nfsd-42728 [000] ....... 1678.271095: nfsd_compound_status: op=3/3 OP_READLINK status=0
    kworker/u8:5-459 [002] ....... 1690.685190: nfsd_cb_recall_any: addr=192.168.50.101:833 client 660ffa56:69849ebb keep=0 bmval0=RDATA_DLG
    kworker/u8:5-459 [002] ....... 1690.685191: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
    kworker/u8:5-459 [002] ....... 1690.685194: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UP
    kworker/u8:0-42764 [002] ....... 1699.821526: nfsd_cb_seq_status: task:00000001@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-5 seq_status=1
    kworker/u8:0-42764 [002] ....... 1699.821527: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
    kworker/u8:0-42764 [002] ....... 1699.821528: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
    kworker/u8:0-42764 [002] ....... 1699.821533: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
    kworker/u8:0-42764 [002] ....... 1699.821538: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
    kworker/u8:0-42764 [003] ....... 1709.037319: nfsd_cb_seq_status: task:00000002@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-5 seq_status=1
    kworker/u8:0-42764 [003] ....... 1709.037320: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
    kworker/u8:0-42764 [003] ....... 1709.037325: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
    kworker/u8:0-42764 [003] ....... 1709.037327: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT

This repeats for a while over several seconds. Eventually the server tries to probe the callback channel, but it's DOWN. Cue more nfsd_cb_queue/nfsd_cb_start calls repeating over time:

    kworker/u8:3-42 [003] ....... 1773.550044: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
    kworker/u8:3-42 [003] ....... 1773.550049: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
    kworker/u8:3-42 [003] ....... 1773.550053: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
    nfsd-42728 [000] ....1.. 1781.401822: nfsd_cb_lost: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
    nfsd-42728 [000] ....2.. 1781.401823: nfsd_cb_probe: addr=192.168.50.101:0 client 660ffa56:69849ebb state=FAULT
    nfsd-42728 [000] ....2.. 1781.401823: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UNKNOWN
    nfsd-42728 [000] ....2.. 1781.401824: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
    kworker/u8:3-42 [000] ....... 1781.401832: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=UNKNOWN
    kworker/u8:5-459 [003] ....... 1781.401833: nfsd_cb_seq_status: task:0000000a@00000003 sessionid=660ffa56:69849ebb:0000001e:00000000 tk_status=-107 seq_status=1
    kworker/u8:3-42 [000] ....... 1781.401833: nfsd_cb_bc_update: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
    kworker/u8:3-42 [000] ....... 1781.401833: nfsd_cb_bc_shutdown: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
    kworker/u8:5-459 [003] ....... 1781.401833: nfsd_cb_restart: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (first try)
    kworker/u8:5-459 [003] ....... 1781.401836: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
    kworker/u8:3-42 [000] ....... 1781.401858: nfsd_cb_new_state: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
    kworker/u8:3-42 [000] ....... 1781.401858: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
    kworker/u8:3-42 [000] ....... 1781.401859: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN

Eventually, the client calls back (after rebooting), but it can't get through:

    kworker/u8:3-42 [000] ....... 1782.391951: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
    kworker/u8:3-42 [000] ....... 1782.391951: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=00000000c4627b33 (first try)
    kworker/u8:3-42 [000] ....... 1782.391952: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
    kworker/u8:3-42 [000] ....... 1782.391952: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)
    nfsd-42728 [000] ....... 1782.409010: nfsd_compound: xid=0xa299edc2 opcnt=1 tag=
    nfsd-42728 [000] ....1.. 1782.409039: nfsd_clid_verf_mismatch: client 660ffa56:69849ebb verf=0x17c364ee8596f72a, updated=0x17c3650afe9e0a7f from addr=192.168.50.101:926
    nfsd-42728 [000] ....... 1782.409041: nfsd_compound_status: op=1/1 OP_EXCHANGE_ID status=0
    nfsd-42728 [000] ....... 1782.409241: nfsd_compound: xid=0xa399edc2 opcnt=1 tag=
    nfsd-42728 [000] ....1.. 1782.409245: nfsd_clid_verf_mismatch: client 660ffa56:69849ebb verf=0x17c364ee8596f72a, updated=0x17c3650afe9e0a7f from addr=192.168.50.101:926
    nfsd-42728 [000] ....... 1782.409245: nfsd_compound_status: op=1/1 OP_EXCHANGE_ID status=0
    nfsd-42728 [000] ....... 1782.409309: nfsd_compound: xid=0xa499edc2 opcnt=1 tag=
    nfsd-42728 [000] ....1.. 1782.409325: check_slot_seqid: seqid=1 slot_seqid=0 inuse=0 conf=0 ret=0
    nfsd-42728 [000] ....1.. 1782.409326: mark_client_expired_locked: addr=192.168.50.101:0 client 660ffa56:69849ebb error=10008
    nfsd-42728 [000] ....... 1782.409329: nfsd_compound_status: op=1/1 OP_CREATE_SESSION status=10008
    kworker/u8:3-42 [000] ....... 1782.417901: nfsd_cb_start: addr=192.168.50.101:0 client 660ffa56:69849ebb state=DOWN
    kworker/u8:3-42 [000] ....... 1782.417901: nfsd_cb_queue: addr=192.168.50.101:0 client 660ffa56:69849ebb cb=000000006e42ee91 (need restart)

The EXCHANGE_ID works, but CREATE_SESSION returns NFS4ERR_DELAY because the server couldn't unhash the client due to the refcount (cl_rpc_users) being high. I didn't capture the value of the refcount at this point, but the CB_RECALL_ANY job holds a reference to the old client, so that alone would be enough to prevent unhashing it.

I think we need to rework the refcounting on the CB_RECALL_ANY job, at a minimum. It probably shouldn't hold a reference to the client. Maybe make it store the clientid and just redo the lookup if the client is still around? I might try to spin up a patch along those lines.

Longer term, I think we need to take a hard look at why the cl_rpc_users refcount ensures that the client remains hashed. It seems like we ought to be able to unhash it, even if there are outstanding references, no?

Thoughts on other ways to fix this, or other structural fixes?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>