Re: NFSD threads hang when destroying a session or client ID

Hi folks,

I hate to bring bad news, but yesterday I ran into this problem twice on kernel 6.12.9 (see attachment). AFAIK 6.12.9 is supposed to include 961b4b5e86bf56a2e4b567f81682defa5cba957e and 8626664c87eebb21a40d4924b2f244a1f56d8806.

Regards
Harri

________________________________________
From: Salvatore Bonaccorso <salvatore.bonaccorso@xxxxxxxxx> on behalf of Salvatore Bonaccorso <carnil@xxxxxxxxxx>
Sent: Friday, February 21, 2025 14:42
To: Baptiste PELLEGRIN via Bugspray Bot
Cc: anna@xxxxxxxxxx; jlayton@xxxxxxxxxx; cel@xxxxxxxxxx; herzog@xxxxxxxxxxxx; tom@xxxxxxxxxx; trondmy@xxxxxxxxxx; benoit.gschwind@xxxxxxxxxxxxxxxxx; baptiste.pellegrin@xxxxxxxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; Harald Dunkel; chuck.lever@xxxxxxxxxx
Subject: Re: NFSD threads hang when destroying a session or client ID

Hi,

On Mon, Feb 10, 2025 at 12:05:32PM +0000, Baptiste PELLEGRIN via Bugspray Bot wrote:
> Baptiste PELLEGRIN writes via Kernel.org Bugzilla:
>
> Hello.
>
> Good news for 6.1 kernels.
>
> With these patches applied:
>
> 8626664c87ee NFSD: Replace dprintks in nfsd4_cb_sequence_done()
> 961b4b5e86bf NFSD: Reset cb_seq_status after NFS4ERR_DELAY
>
> I have had no hangs in more than two weeks of server uptime. Previously the hangs occurred every week.
>
> I only see some suspicious server load, maybe caused by sending CB_RECALL_ANY to shut-down/suspended/courtesy clients.
>
> I see a lot of work on the list that tries to address these problems:
>
> nfsd: eliminate special handling of NFS4ERR_SEQ_MISORDERED
> nfsd: handle NFS4ERR_BADSLOT on CB_SEQUENCE better
> nfsd: when CB_SEQUENCE gets ESERVERFAULT don't increment seq_nr
> nfsd: only check RPC_SIGNALLED() when restarting rpc_task
> nfsd: always release slot when requeueing callback
> nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> nfsd: prepare nfsd4_cb_sequence_done() for error handling rework
>
> NFSD: Skip sending CB_RECALL_ANY when the backchannel isn't up
>
> NFSD: fix hang in nfsd4_shutdown_callback

So I see the backport of 961b4b5e86bf NFSD: Reset cb_seq_status after
NFS4ERR_DELAY landed in the just released 6.1.129 stable version.

Do we consider this sufficient to have stabilized the situation
around this issue? (I do realize much other work has been done as well,
some of which has already flowed down to the stable series.)

This reply is mainly written with https://bugs.debian.org/1071562 in mind.

Regards,
Salvatore

Attachment: x2
Description: x2

