Hi folks,

I don't like to bring bad news, but yesterday I had a problem with kernel 6.12.9 (twice), see attachment. AFAIK 6.12.9 is supposed to include 961b4b5e86bf56a2e4b567f81682defa5cba957e and 8626664c87eebb21a40d4924b2f244a1f56d8806.

Regards
Harri

________________________________________
From: Salvatore Bonaccorso <salvatore.bonaccorso@xxxxxxxxx> on behalf of Salvatore Bonaccorso <carnil@xxxxxxxxxx>
Sent: Friday, February 21, 2025 14:42
To: Baptiste PELLEGRIN via Bugspray Bot
Cc: anna@xxxxxxxxxx; jlayton@xxxxxxxxxx; cel@xxxxxxxxxx; herzog@xxxxxxxxxxxx; tom@xxxxxxxxxx; trondmy@xxxxxxxxxx; benoit.gschwind@xxxxxxxxxxxxxxxxx; baptiste.pellegrin@xxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; Harald Dunkel; chuck.lever@xxxxxxxxxx
Subject: Re: NFSD threads hang when destroying a session or client ID

Hi,

On Mon, Feb 10, 2025 at 12:05:32PM +0000, Baptiste PELLEGRIN via Bugspray Bot wrote:
> Baptiste PELLEGRIN writes via Kernel.org Bugzilla:
>
> Hello.
>
> Good news for 6.1 kernels.
>
> With these patches applied :
>
> 8626664c87ee NFSD: Replace dprintks in nfsd4_cb_sequence_done()
> 961b4b5e86bf NFSD: Reset cb_seq_status after NFS4ERR_DELAY
>
> No hangs anymore for me since more than two weeks of server uptime. And previously the hangs occurred every weeks.
>
> I just see some suspicious server load maybe caused by the send of RPC_RECALL_ANY to shutdown/suspended/courtesy clients.
>
> I see a lot of work on the list that try to address these problems :
>
> nfsd: eliminate special handling of NFS4ERR_SEQ_MISORDERED
> nfsd: handle NFS4ERR_BADSLOT on CB_SEQUENCE better
> nfsd: when CB_SEQUENCE gets ESERVERFAULT don't increment seq_nr
> nfsd: only check RPC_SIGNALLED() when restarting rpc_task
> nfsd: always release slot when requeueing callback
> nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> nfsd: prepare nfsd4_cb_sequence_done() for error handling rework
>
> NFSD: Skip sending CB_RECALL_ANY when the backchannel isn't up
>
> NFSD: fix hang in nfsd4_shutdown_callback

So I see the backport of 961b4b5e86bf NFSD: Reset cb_seq_status after NFS4ERR_DELAY landed in the just released 6.1.129 stable version.

Do we consider this to be sufficient to have stabilized the situation about this issue? (I do realize much other work has been done as well which partially has flown down to stable series already).

This reply is mainly in focus of https://bugs.debian.org/1071562

Regards,
Salvatore

District Court Aachen - HRB 8057
Management Board: Arnaud Picut (CEO), Hicham El Bonne (CTO)
Chairman of the Supervisory Board: Benjamin Carl Lucas
Attachment: x2
Description: x2