NeilBrown wrote:
We can guess though. It isn't waiting for a lock - that would show in the above list - so it might be waiting for a wakeup, or might be spinning. The only wake-up I can imagine is in one of the memory-allocation calls, but if the system were running out of memory we would probably see messages about that.
I have seen something like this. I am running NFS inside a container, using legacy cgroup. When it got stuck, it claimed I could not log in to the container because it was out of memory. When it happens again, I can send you the exact error message. The next hung nfsd is overdue anyway.
I wonder if it could be looping in svc_xprt_destroy_all(), and sitting in the msleep() when the hang is detected, so there are no locks to report. I can't see why it would block there. It would really help to get a full task list. There is a sysctl for that: /proc/sys/kernel/hung_task_all_cpu_backtrace. Could that be enabled?
I have enabled it on my NFS server (echo 1 >/proc/.../hung_task_all_cpu_backtrace).

Regards
Harri