Re: nfsd blocks indefinitely in nfsd4_destroy_session


 



On 1/17/25 2:43 PM, Baptiste PELLEGRIN wrote:


On 1/16/25 9:07 PM, Chuck Lever wrote:
On 1/11/25 3:14 AM, Pellegrin Baptiste wrote:
On 09/01/25 17:32, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
On 1/9/25 7:42 AM, Jeff Layton wrote:
>On Thu, 2025-01-09 at 12:56 +0100, Christian Herzog wrote:
>>Dear Chuck,
>>
>>On Wed, Jan 08, 2025 at 10:07:49AM -0500, Chuck Lever wrote:
>>>On 1/8/25 9:54 AM, Christian Herzog wrote:
>>>>Dear Chuck,
>>>>
>>>>On Wed, Jan 08, 2025 at 08:33:23AM -0500, Chuck Lever wrote:
>>>>>On 1/7/25 4:17 PM, Salvatore Bonaccorso wrote:
>>>>>>Hi Chuck,
>>>>>>
>>>>>>Thanks for your time on this, much appreciated.
>>>>>>
>>>>>>On Wed, Jan 01, 2025 at 02:24:50PM -0500, Chuck Lever wrote:
>>>>>>>On 12/25/24 4:15 AM, Salvatore Bonaccorso wrote:
>>>>>>>>Hi Chuck, hi all,
>>>>>>>>
>>>>>>>>[it was not ideal to pick one of the messages for this followup, let me
>>>>>>>>know if you want a complete new thread; adding as well Benjamin and
>>>>>>>>Trond as they are involved in one mentioned patch]
>>>>>>>>
>>>>>>>>On Mon, Jun 17, 2024 at 02:31:54PM +0000, Chuck Lever III wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>On Jun 17, 2024, at 2:55 AM, Harald Dunkel <harald.dunkel@xxxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>>Hi folks,
>>>>>>>>>>
>>>>>>>>>>what would be the reason for nfsd getting stuck somehow and becoming
>>>>>>>>>>an unkillable process? See
>>>>>>>>>>
>>>>>>>>>>- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562
>>>>>>>>>>- https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568
>>>>>>>>>>
>>>>>>>>>>Doesn't this mean that something inside the kernel gets stuck as
>>>>>>>>>>well? Seems odd to me.
>>>>>>>>>
>>>>>>>>>I'm not familiar with the Debian or Ubuntu kernel packages. Can
>>>>>>>>>the kernel release numbers be translated to LTS kernel releases
>>>>>>>>>please? Need both "last known working" and "first broken" releases.
>>>>>>>>>
>>>>>>>>>This:
>>>>>>>>>
>>>>>>>>>[ 6596.911785] RPC: Could not send backchannel reply error: -110
>>>>>>>>>[ 6596.972490] RPC: Could not send backchannel reply error: -110
>>>>>>>>>[ 6837.281307] RPC: Could not send backchannel reply error: -110
>>>>>>>>>
>>>>>>>>>is a known set of client backchannel bugs. Knowing the LTS kernel
>>>>>>>>>releases (see above) will help us figure out what needs to be
>>>>>>>>>backported to the LTS kernels in question.
>>>>>>>>>
>>>>>>>>>This:
>>>>>>>>>
>>>>>>>>>[11183.290619] wait_for_completion+0x88/0x150
>>>>>>>>>[11183.290623] __flush_workqueue+0x140/0x3e0
>>>>>>>>>[11183.290629] nfsd4_probe_callback_sync+0x1a/0x30 [nfsd]
>>>>>>>>>[11183.290689] nfsd4_destroy_session+0x186/0x260 [nfsd]
>>>>>>>>>
>>>>>>>>>is probably related to the backchannel errors on the client, but
>>>>>>>>>client bugs shouldn't cause the server to hang like this. We
>>>>>>>>>might be able to say more if you can provide the kernel release
>>>>>>>>>translations (see above).
>>>>>>>>
>>>>>>>>In Debian we still have the bug #1071562 open, and one person notified
>>>>>>>>me offlist that it appears the issue got more frequent since
>>>>>>>>they updated on the NFS client side from Ubuntu 20.04 to Debian bookworm
>>>>>>>>(with a 6.1.y based kernel).
>>>>>>>>
>>>>>>>>Some people around those issues, seem to claim that the change
>>>>>>>>mentioned in
>>>>>>>>https://lists.proxmox.com/pipermail/pve-devel/2024-July/064614.html
>>>>>>>>would fix the issue, which is as well backchannel related.
>>>>>>>>
>>>>>>>>This is upstream: 6ddc9deacc13 ("SUNRPC: Fix backchannel reply,
>>>>>>>>again"). While this commit fixes 57331a59ac0d ("NFSv4.1: Use the
>>>>>>>>nfs_client's rpc timeouts for backchannel"), this is not something
>>>>>>>>which goes back to 6.1.y. Could it be possible that the backchannel
>>>>>>>>refactoring and this final fix indeed fix the issue?
>>>>>>>>
>>>>>>>>As people report it is not easily reproducible, so this makes it
>>>>>>>>harder to identify fixes correctly.
>>>>>>>>
>>>>>>>>I gave a (short) stance on trying to backport commits up to
>>>>>>>>6ddc9deacc13 ("SUNRPC: Fix backchannel reply, again") but this quickly
>>>>>>>>seems to indicate it is probably still not the right thing for
>>>>>>>>backporting to the older stable series.
>>>>>>>>
>>>>>>>>As at least pre-requisites:
>>>>>>>>
>>>>>>>>2009e32997ed568a305cf9bc7bf27d22e0f6ccda
>>>>>>>>4119bd0306652776cb0b7caa3aea5b2a93aecb89
>>>>>>>>163cdfca341b76c958567ae0966bd3575c5c6192
>>>>>>>>f4afc8fead386c81fda2593ad6162271d26667f8
>>>>>>>>6ed8cdf967f7e9fc96cd1c129719ef99db2f9afc
>>>>>>>>57331a59ac0d680f606403eb24edd3c35aecba31
>>>>>>>>
>>>>>>>>and still there would be conflicting codepaths (and does not seem
>>>>>>>>right).
>>>>>>>>
>>>>>>>>Chuck, Benjamin, Trond, is there anything we can provide on the
>>>>>>>>reporters' side so that we can try to tackle this issue better?
>>>>>>>
>>>>>>>As I've indicated before, NFSD should not block no matter what the
>>>>>>>client may or may not be doing. I'd like to focus on the server first.
>>>>>>>
>>>>>>>What is the result of:
>>>>>>>
>>>>>>>$ cd <Bookworm's v6.1.90 kernel source >
>>>>>>>$ unset KBUILD_OUTPUT
>>>>>>>$ make -j `nproc`
>>>>>>>$ scripts/faddr2line \
>>>>>>> fs/nfsd/nfs4state.o \
>>>>>>> nfsd4_destroy_session+0x16d
>>>>>>>
>>>>>>>Since this issue appeared after v6.1.1, is it possible to bisect
>>>>>>>between v6.1.1 and v6.1.90 ?
>>>>>>
>>>>>>First please note, at least speaking of triggering the issue in
>>>>>>Debian, Debian has moved to 6.1.119 based kernel already (and soon in
>>>>>>the weekends point release move to 6.1.123).
>>>>>>
>>>>>>That said, one of the users who regularly seems to be hit by the
>>>>>>issue was able to provide the above requested information, based on
>>>>>>6.1.119:
>>>>>>
>>>>>>~/kernel/linux-source-6.1# make kernelversion
>>>>>>6.1.119
>>>>>>~/kernel/linux-source-6.1# scripts/faddr2line fs/nfsd/nfs4state.o nfsd4_destroy_session+0x16d
>>>>>>nfsd4_destroy_session+0x16d/0x250:
>>>>>>__list_del_entry at /root/kernel/linux-source-6.1/./include/linux/list.h:134
>>>>>>(inlined by) list_del at /root/kernel/linux-source-6.1/./include/linux/list.h:148
>>>>>>(inlined by) unhash_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:2062
>>>>>>(inlined by) nfsd4_destroy_session at /root/kernel/linux-source-6.1/fs/nfsd/nfs4state.c:3856
>>>>>>
>>>>>>They could provide as well a decode_stacktrace output for the recent
>>>>>>hit (if that is helpful for you):
>>>>>>
>>>>>>[Mon Jan 6 13:25:28 2025] INFO: task nfsd:55306 blocked for more than 6883 seconds.
>>>>>>[Mon Jan 6 13:25:28 2025]       Not tainted 6.1.0-28-amd64 #1 Debian 6.1.119-1
>>>>>>[Mon Jan 6 13:25:28 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>>[Mon Jan 6 13:25:28 2025] task:nfsd            state:D stack:0     pid:55306 ppid:2      flags:0x00004000
>>>>>>[Mon Jan 6 13:25:28 2025] Call Trace:
>>>>>>[Mon Jan 6 13:25:28 2025]  <TASK>
>>>>>>[Mon Jan 6 13:25:28 2025] __schedule+0x34d/0x9e0
>>>>>>[Mon Jan 6 13:25:28 2025] schedule+0x5a/0xd0
>>>>>>[Mon Jan 6 13:25:28 2025] schedule_timeout+0x118/0x150
>>>>>>[Mon Jan 6 13:25:28 2025] wait_for_completion+0x86/0x160
>>>>>>[Mon Jan 6 13:25:28 2025] __flush_workqueue+0x152/0x420
>>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_destroy_session (debian/build/build_amd64_none_amd64/include/linux/spinlock.h:351 debian/build/build_amd64_none_amd64/fs/nfsd/nfs4state.c:3861) nfsd
>>>>>>[Mon Jan 6 13:25:28 2025] nfsd4_proc_compound (debian/build/build_amd64_none_amd64/fs/nfsd/nfs4proc.c:2680) nfsd
>>>>>>[Mon Jan 6 13:25:28 2025] nfsd_dispatch (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:1022) nfsd
>>>>>>[Mon Jan 6 13:25:28 2025] svc_process_common (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1344) sunrpc
>>>>>>[Mon Jan 6 13:25:28 2025] ? svc_recv (debian/build/build_amd64_none_amd64/net/sunrpc/svc_xprt.c:897) sunrpc
>>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_svc (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:983) nfsd
>>>>>>[Mon Jan 6 13:25:28 2025] ? nfsd_inet6addr_event (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:922) nfsd
>>>>>>[Mon Jan 6 13:25:28 2025] svc_process (debian/build/build_amd64_none_amd64/net/sunrpc/svc.c:1474) sunrpc
>>>>>>[Mon Jan 6 13:25:28 2025] nfsd (debian/build/build_amd64_none_amd64/fs/nfsd/nfssvc.c:960) nfsd
>>>>>>[Mon Jan 6 13:25:28 2025] kthread+0xd7/0x100
>>>>>>[Mon Jan 6 13:25:28 2025] ? kthread_complete_and_exit+0x20/0x20
>>>>>>[Mon Jan 6 13:25:28 2025] ret_from_fork+0x1f/0x30
>>>>>>[Mon Jan  6 13:25:28 2025]  </TASK>
>>>>>>
>>>>>>The question about bisection is actually harder: those are production
>>>>>>systems and I understand it's not possible to bisect there, while
>>>>>>unfortunately not reproducing the issue under "lab conditions".
>>>>>>
>>>>>>Does the above give us still a clue on what you were looking for?
>>>>>
>>>>>Thanks for the update.
>>>>>
>>>>>It's possible that 38f080f3cd19 ("NFSD: Move callback_wq into struct
>>>>>nfs4_client"), while not a specific fix for this issue, might offer some
>>>>>relief by preventing the DESTROY_SESSION hang from affecting all other
>>>>>clients of the degraded server.
>>>>>
>>>>>Not having a reproducer or the ability to bisect puts a damper on
>>>>>things. The next step, then, is to enable tracing on servers where this
>>>>>issue can come up, and wait for the hang to occur. The following command
>>>>>captures information only about callback operations, so it should not
>>>>>generate much data or impact server performance.
>>>>>
>>>>>    # trace-cmd record -e nfsd:nfsd_cb\*
>>>>>
>>>>>Let that run until the problem occurs, then ^C, compress the resulting
>>>>>trace.dat file, and either attach it to 1071562 or email it to me
>>>>>privately.
>>>>thanks for the follow-up.
>>>>
>>>>I am the "customer" with two affected file servers. We have since moved those
>>>>servers to the backports kernel (6.11.10) in the hope of forward fixing the
>>>>issue. If this kernel is stable, I'm afraid I can't go back to the 'bad'
>>>>kernel (700+ researchers affected..), and we're also not able to trigger the
>>>>issue on our test environment.
>>>
>>>Hello Dr. Herzog -
>>>
>>>If the problem recurs on 6.11, the trace-cmd I suggest above works
>>>there as well.
>>the bad news is: it just happened again with the bpo kernel.
>>
>>We then immediately started trace-cmd since usually there are several call
>>traces in a row and we did get a trace.dat:
>>http://people.phys.ethz.ch/~daduke/trace.dat
>>
>>we did notice however that the stack trace looked a bit different this time:
>>
>>     INFO: task nfsd:2566 blocked for more than 5799 seconds.
>>     Tainted: G        W          6.11.10+bpo-amd64 #1 Debia>
>>     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables t>
>>     task:nfsd            state:D stack:0     pid:2566  tgid:2566 >
>>     Call Trace:
>>     <TASK>
>>     __schedule+0x400/0xad0
>>     schedule+0x27/0xf0
>>     nfsd4_shutdown_callback+0xfe/0x140 [nfsd]
>>     ? __pfx_var_wake_function+0x10/0x10
>>     __destroy_client+0x1f0/0x290 [nfsd]
>>     nfsd4_destroy_clientid+0xf1/0x1e0 [nfsd]
>>     ? svcauth_unix_set_client+0x586/0x5f0 [sunrpc]
>>     nfsd4_proc_compound+0x34d/0x670 [nfsd]
>>     nfsd_dispatch+0xfd/0x220 [nfsd]
>>     svc_process_common+0x2f7/0x700 [sunrpc]
>>     ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
>>     svc_process+0x131/0x180 [sunrpc]
>>     svc_recv+0x830/0xa10 [sunrpc]
>>     ? __pfx_nfsd+0x10/0x10 [nfsd]
>>     nfsd+0x87/0xf0 [nfsd]
>>     kthread+0xcf/0x100
>>     ? __pfx_kthread+0x10/0x10
>>     ret_from_fork+0x31/0x50
>>     ? __pfx_kthread+0x10/0x10
>>     ret_from_fork_asm+0x1a/0x30
>>     </TASK>
>>
>>and also the state of the offending client in `/proc/fs/nfsd/clients/*/info`
>>used to be callback state: UNKNOWN while now it is DOWN or FAULT. No idea
>>what it means, but I thought it was worth mentioning.
>>
>
>Looks like this is hung in nfsd41_cb_inflight_wait_complete() ? That
>probably means that the cl_cb_inflight counter is stuck at >0. I'm
>guessing that means that there is some callback that it's expecting to
>complete that isn't. From nfsd4_shutdown_callback():
>
>         /*
>          * Note this won't actually result in a null callback;
>          * instead, nfsd4_run_cb_null() will detect the killed
>          * client, destroy the rpc client, and stop:
>          */
>         nfsd4_run_cb(&clp->cl_cb_null);
>         flush_workqueue(clp->cl_callback_wq);
>         nfsd41_cb_inflight_wait_complete(clp);
>
>...it sounds like that isn't happening properly though.
>
>It might be interesting to see if you can track down the callback
>client in /sys/kernel/debug/sunrpc and see what it's doing.

Here's a possible scenario:

The trace.dat shows the server is sending a lot of CB_RECALL_ANY
operations.

Normally a callback occurs only due to some particular client request.
Such requests would be completed before a client unmounts.

CB_RECALL_ANY is an operation which does not occur due to a particular
client operation and can occur at any time. I believe this is the
first operation of this type we've added to NFSD.

My guess is that the server's laundromat has sent some CB_RECALL_ANY
operations while the client is umounting -- DESTROY_SESSION is
racing with those callback operations.

deleg_reaper() was backported to 6.1.y in 6.1.81, which lines up
more or less with when the issues started to show up.

A quick way to test this theory would be to make deleg_reaper() a
no-op. If this helps, then we know that shutting down the server's
callback client needs to be more careful about cleaning up outstanding
callbacks.


--
Chuck Lever

Hello.

Thanks a lot for all of this work !

I have started recording traces using :

# trace-cmd record -e nfsd:nfsd_cb\*

and am currently waiting for the bug to trigger. This generally happens
within one to two weeks for me.

If you want me to adjust some recording parameters, let me know.

After looking at your data and Rik's data, I decided to try reproducing
the problem in my lab. No luck so far.

The signatures of these issues are slightly different from each other:

v6.1:
- nfsd4_destroy_session hangs on the flush_workqueue in
   nfsd4_shutdown_callback()
- There might be outstanding callback RPCs that are not exiting
- Lots of CB_RECALL_ANY traffic. Only place that's generated in
   v6.1 is the NFSD shrinker when memory is short
- There appear to be some unreachable clients -- evidence of client
   crashes or network outages, perhaps

v6.12:
- destroy_client() hangs on the wait_var_event in
   nfsd4_shutdown_callback()
- There are no callback RPCs left after the hang
- There are some stuck clients listed under /proc/fs/nfsd/

Similar, but not the same.

I'm still open to clues, so please continue the regimen of starting
the trace-cmd when the server boots, and doing an "echo t >
/proc/sysrq-trigger" when the hang appears.

  # trace-cmd record -e nfsd:nfsd_cb\* -e sunrpc:svc_xprt_detach \
    -p function -l nfsd4_destroy_session



Hello Chuck.

Sorry for my late reply, but I needed to find something
interesting for you.

Did you see my call trace analysis here?

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071562

On v6.1, did you see that nfsd4_shutdown_callback is reached from
both the nfsd4 laundromat_main and nfsd4_state_shrinker_worker?
For me, the hang in v6.1 also comes from wait_var_event.

Here is the state of the RAM on my server under some load.

               total        used        free      shared  buff/cache   available
Mem:           32083        4982         651           5       26919       27101
Swap:          16629         842       15787

My desktop clients heavily use suspend. The computers are almost
never rebooted. When a user logs out, suspend is automatically
triggered. Do you think that delegations could accumulate,
held by many suspended hosts, and cause the
high memory/cache usage?

v6.1 I think has courteous server support, meaning the server will
attempt to preserve all client state when the client stops sending
state-renewing operations.

Suspended clients probably don't clean up their network connections.
The server could see that as a network partition.

Still, the server is supposed to clear out an idle connection after
6 minutes. So these hangs should clear up if this were the only
issue.


Here is my crash timeline between the desktop client salleprofs13
(the one you think is unreachable) and the server fichdc02.
The failure comes from suspend.


---
Client salleprofs13 woke from sleep at 09:55:37
and was attached by the server.
The callback channel is initialized.
---

janv. 14 07:20:32 salleprofs13 kernel: PM: suspend entry (deep)

janv. 14 09:55:37 salleprofs13 kernel: PM: suspend exit
janv. 14 09:55:37 salleprofs13 ModemManager[621]: <info> [sleep-monitor-systemd] system is resuming
janv. 14 09:55:37 salleprofs13 NetworkManager[576]: <info> [1736844937.2382] manager: sleep: wake requested (sleeping: yes enabled: yes)
janv. 14 09:55:37 salleprofs13 systemd[1]: Stopping gssd-bug.service - Gssd credential cache hook...
janv. 14 09:55:37 salleprofs13 systemd[1]: Starting winbind.service - Samba Winbind Daemon...
janv. 14 09:55:37 salleprofs13 systemd[1]: Started winbind.service - Samba Winbind Daemon.
janv. 14 09:55:40 salleprofs13 NetworkManager[576]: <info> [1736844940.6304] device (enp2s0): carrier: link connected
janv. 14 09:55:40 salleprofs13 kernel: r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off

nfsd-1695  [047] ..... 190512.862590: nfsd_cb_probe: addr=172.16.118.121:0 client 67833e73:4dd32903 state=UNKNOWN
kworker/u96:8-225529 [047] ..... 190512.862782: nfsd_cb_setup: addr=172.16.118.121:0 client 67833e73:4dd32903 proto=tcp flavor=sys
kworker/u96:8-225529 [047] ..... 190512.862784: nfsd_cb_state: addr=172.16.118.121:0 client 67833e73:4dd32903 state=UP
2025-01-14T09:55:48.290546+01:00 fichdc02 rpc.mountd[1363]: v4.2 client attached: 0x4dd3290367833e73 from "172.16.118.121:825"


---
Someone logs on to the station.
Immediately the server starts sending nfsd_cb_recall_any.
Strange, no? What can be recalled on that host which just
woke up...

The server attempts to invoke CB_RECALL_ANY on all clients it knows
about. If it believes the client has state (even when in courtesy)
then the server will send this request to that client.


---

janv. 14 09:55:53 salleprofs13 nfsrahead[121742]: setting /dnfs/wine readahead to 128
janv. 14 09:56:13 salleprofs13 systemd-logind[569]: New session 36 of user xxxxxxxx.

kworker/u96:3-227781 [047] ..... 190541.974364: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [013] ..... 190541.977732: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [000] ..... 190550.387742: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:0-228547 [003] ..... 190550.389602: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [021] ..... 190565.367759: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:1-230409 [000] ..... 190565.371391: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [013] ..... 190579.748430: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:8-225529 [026] ..... 190579.749946: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:8-225529 [013] ..... 190592.749738: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [001] ..... 190592.751192: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:0-228547 [000] ..... 190599.849333: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:0-228547 [000] ..... 190599.850872: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:2-220501 [045] ..... 190686.760138: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:1-230409 [014] ..... 190686.761181: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:1-230409 [013] ..... 190694.103480: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:8-225529 [022] ..... 190694.106120: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:8-225529 [000] ..... 190749.678378: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [007] ..... 190749.679870: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0
kworker/u96:4-230705 [014] ..... 190782.830144: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:1-230409 [010] ..... 190782.831220: nfsd_cb_recall_any_done: client 67833e73:4dd32903 status=0


---
Next the user suspends the host, maybe by pressing
the power button without logging out, so the last
nfsd_cb_recall_any fails.
Maybe the "unrecognized reply" is related?
---

janv. 14 10:01:18 salleprofs13 systemd-logind[569]: The system will suspend now!
janv. 14 10:01:18 salleprofs13 NetworkManager[576]: <info> [1736845278.5631] manager: sleep: sleep requested (sleeping: no enabled: yes)
janv. 14 10:01:18 salleprofs13 ModemManager[621]: <info> [sleep-monitor-systemd] system is about to suspend
janv. 14 10:01:20 salleprofs13 kernel: PM: suspend entry (deep)

kworker/u96:2-220501 [047] ..... 190851.843698: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG
kworker/u96:2-220501 [047] ..... 190861.062828: nfsd_cb_state: addr=172.16.118.121:0 client 67833e73:4dd32903 state=FAULT

2025-01-14T10:06:06.586140+01:00 fichdc02 kernel: [191131.106081] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000ad57929d xid 52751431


---
Got 5 unrecognized replies.
Seems not related.
---

2025-01-14T10:06:56.762137+01:00 fichdc02 kernel: [191181.279788] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 5d819d7d
2025-01-14T10:06:56.762197+01:00 fichdc02 kernel: [191181.280005] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 61819d7d
2025-01-14T10:06:56.762201+01:00 fichdc02 kernel: [191181.280054] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 60819d7d
2025-01-14T10:06:56.762206+01:00 fichdc02 kernel: [191181.280223] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 5f819d7d
2025-01-14T10:06:56.762209+01:00 fichdc02 kernel: [191181.280364] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000002dcd99c8 xid 5e819d7d


---
Just 10 minutes after suspend, the host wakes up (with xxxxxxxx still logged on).
The gssd-bug.service is a custom script that kills all the user's processes
if the Kerberos cache file has expired. This is due to an rpc.gssd bug
that hammers the server when the cache has expired during suspend.
It runs "killall -u xxxxxxxx". But here, I don't know whether the killall
was called or not.

Just after the wake, the server again sends an nfsd_cb_recall_any. But this
time there is no nfsd_cb_state with state=FAULT. From this point there are no
more kworker cb calls. I think this is where the hang starts.

I'm thinking the hang started a little earlier, but I agree that this is
the first symptom of it.

I was attempting to reproduce by periodically dropping the connection,
but it appears there needs to be one or two more steps to make it behave
like a suspended client. I will see what can be done.


And we got an unrecognized reply again.
---

janv. 14 10:10:24 salleprofs13 kernel: PM: suspend exit
janv. 14 10:10:24 salleprofs13 systemd-sleep[124910]: System returned from sleep state.
janv. 14 10:10:24 salleprofs13 systemd[1]: Stopping gssd-bug.service - Gssd credential cache hook...
janv. 14 10:10:24 salleprofs13 NetworkManager[576]: <info> [1736845824.4117] manager: sleep: wake requested (sleeping: yes enabled: yes)
janv. 14 10:10:24 salleprofs13 check-gssd-cache[124962]: id: « xxxxxxxx » : the user don't exist
janv. 14 10:10:24 salleprofs13 systemd[1]: Started winbind.service - Samba Winbind Daemon.
janv. 14 10:10:27 salleprofs13 kernel: r8169 0000:02:00.0 enp2s0: Link is Up - 1Gbps/Full - flow control off
janv. 14 10:10:27 salleprofs13 NetworkManager[576]: <info> [1736845827.6412] device (enp2s0): carrier: link connected
janv. 14 10:10:35 salleprofs13 systemd[1]: Stopped gssd-bug.service - Gssd credential cache hook.

kworker/u96:2-220501 [012] ..... 191417.544355: nfsd_cb_recall_any: addr=172.16.118.121:825 client 67833e73:4dd32903 keep=0 bmval0=RDATA_DLG

2025-01-14T10:10:58.426113+01:00 fichdc02 kernel: [191422.934976] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 000000007205cdb2 xid ee91a9d8


---
nfsd probes a failed state. Normal.
But as the kworker hangs, it can't reinitialize the callback channel
after the nfsd_cb_state "state=UNKNOWN".

I don't know why, but nfsd also started
to send nfsd_cb_recall...
Note that this is less than 120 seconds before the "blocked" message,
so the thread may already have been locked.
---

nfsd-1690  [014] ..... 191424.483249: nfsd_cb_probe: addr=172.16.118.121:0 client 67833e73:4dd32903 state=FAULT
nfsd-1690  [014] ..... 191424.483252: nfsd_cb_state: addr=172.16.118.121:0 client 67833e73:4dd32903 state=UNKNOWN
nfsd-1695  [013] ...1. 191441.265047: nfsd_cb_recall: addr=172.16.118.121:0 client 67833e73:4dd32903 stateid 00001c6e:00000001
nfsd-1695  [014] ...1. 191441.646104: nfsd_cb_recall: addr=172.16.118.121:0 client 67833e73:4dd32903 stateid 000015fc:00000001


---
Got some strange logs on the client,
and 7 unrecognized replies on the server that seem not related.
---

janv. 14 10:11:00 salleprofs13 PackageKit[122249]: daemon quit
janv. 14 10:11:00 salleprofs13 systemd[1]: packagekit.service: Deactivated successfully.
janv. 14 10:11:00 salleprofs13 systemd[1]: packagekit.service: Consumed 28.223s CPU time.
janv. 14 10:11:02 salleprofs13 gnome-shell[122127]: JS ERROR: Gio.IOErrorEnum: Erreur lors de l’appel de StartServiceByName pour org.vmware.viewagent.Credentials : Le délai d’>

_injectToMethod/klass[method]@resource:///org/gnome/gjs/modules/core/overrides/Gio.js:287:25

VmwareCredentials@resource:///org/gnome/shell/gdm/vmware.js:34:10

VmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:41:29

getVmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:51:37

ShellUserVerifier@resource:///org/gnome/shell/gdm/util.js:183:64

_init@resource:///org/gnome/shell/gdm/authPrompt.js:74:30

AuthPrompt@resource:///org/gnome/shell/gdm/authPrompt.js:51:4

_ensureAuthPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:686:32

_showPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:725:14

vfunc_key_press_event@resource:///org/gnome/shell/ui/unlockDialog.js:620:14
janv. 14 10:11:05 salleprofs13 gsd-power[122329]: Error setting property 'PowerSaveMode' on interface org.gnome.Mutter.DisplayConfig: Le délai d’attente est dépassé (g-io-erro>
janv. 14 10:11:27 salleprofs13 gnome-shell[122127]: JS ERROR: Gio.IOErrorEnum: Erreur lors de l’appel de StartServiceByName pour org.vmware.viewagent.Credentials : Le délai d’>

_injectToMethod/klass[method]@resource:///org/gnome/gjs/modules/core/overrides/Gio.js:287:25

VmwareCredentials@resource:///org/gnome/shell/gdm/vmware.js:34:10

VmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:41:29

getVmwareCredentialsManager@resource:///org/gnome/shell/gdm/vmware.js:51:37

ShellUserVerifier@resource:///org/gnome/shell/gdm/util.js:183:64

_init@resource:///org/gnome/shell/gdm/authPrompt.js:74:30

AuthPrompt@resource:///org/gnome/shell/gdm/authPrompt.js:51:4

_ensureAuthPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:686:32

_showPrompt@resource:///org/gnome/shell/ui/unlockDialog.js:725:14

vfunc_key_press_event@resource:///org/gnome/shell/ui/unlockDialog.js:620:14



2025-01-14T10:12:36.730114+01:00 fichdc02 kernel: [191521.235822] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d591b1ee
2025-01-14T10:12:36.730165+01:00 fichdc02 kernel: [191521.236547] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d691b1ee
2025-01-14T10:12:36.730169+01:00 fichdc02 kernel: [191521.236666] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid da91b1ee
2025-01-14T10:12:36.730171+01:00 fichdc02 kernel: [191521.236743] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid db91b1ee
2025-01-14T10:12:36.730172+01:00 fichdc02 kernel: [191521.236769] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d991b1ee
2025-01-14T10:12:36.730174+01:00 fichdc02 kernel: [191521.236824] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d891b1ee
2025-01-14T10:12:36.730177+01:00 fichdc02 kernel: [191521.236840] receive_cb_reply: Got unrecognized reply: calldir 0x1 xpt_bc_xprt 00000000191d4459 xid d791b1ee


---
And finally the hang syslog message
---

2025-01-14T10:14:29.370164+01:00 fichdc02 kernel: [191633.871204] INFO: task nfsd:1690 blocked for more than 120 seconds.
2025-01-14T10:14:29.374846+01:00 fichdc02 kernel: [191633.875410] INFO: task nfsd:1691 blocked for more than 120 seconds.
2025-01-14T10:14:29.374966+01:00 fichdc02 kernel: [191633.878106] INFO: task nfsd:1693 blocked for more than 120 seconds.
2025-01-14T10:14:29.378303+01:00 fichdc02 kernel: [191633.880748] INFO: task nfsd:1695 blocked for more than 120 seconds.


Hope this helps.

I'm still recording.

Regards,

Baptiste.




--
Chuck Lever



