Chuck Lever writes via Kernel.org Bugzilla:

(In reply to Baptiste PELLEGRIN from comment #8)
> I always see one or two "unrecognized reply" message around 120 seconds
> before the hang message.
>
> So it may something that happen on client or server weekly jobs ?
> Or maybe some memory leak or cache corruption ?
> Or something related to expired Kerberos cache file ?
> Or expired NFS session ?
> ...
>
> It seems also that the number of nfsd_cb_recall_any callback message
> increase with the server uptime. This seems in favor of the memory leak
> hypothesis.

The server generates a CB_RECALL_ANY message for each active client. If the
number of active clients increases from zero at server boot time to a few
dozen, that would also explain why you see more of these over time.

If your NFS server does not also have NFS mount points, a few client-side
trace points can be enabled to capture more details about NFSv4 callback
activity. "-e sunrpc:xprt_reserve" for example would help us match the XIDs
in the callback operations to the messages you see in the server's system
journal.

View: https://bugzilla.kernel.org/show_bug.cgi?id=219710#c9
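
For reference, the trace capture suggested above could be sketched roughly as follows. This assumes the "-e" flag is meant for trace-cmd (the message does not name the tool), that trace-cmd is installed, and that the capture runs as root on the NFS server, since the server acts as the RPC client when it sends NFSv4 callbacks. The helper name and the output file name are hypothetical. Only the event named in the message is enabled.

```shell
#!/bin/sh
# Sketch only. Assumptions: trace-cmd is installed, this runs as root on
# the NFS server, and the server has no NFS mounts of its own (otherwise
# its own client traffic would show up in the capture as well).

# Trace event named in the message above.
EVENTS="sunrpc:xprt_reserve"

# Hypothetical helper: record the event until interrupted with Ctrl-C,
# writing the capture to the given file (default: trace.dat).
capture_callback_trace() {
    out="${1:-trace.dat}"
    trace-cmd record -o "$out" -e "$EVENTS"
}

# Typical use: start the capture, wait for the "unrecognized reply"
# message to appear in the journal, stop with Ctrl-C, then inspect:
#   capture_callback_trace callbacks.dat
#   trace-cmd report -i callbacks.dat
```

The XIDs printed in the report can then be compared by hand against the XIDs in the journal messages.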