If the state manager thread is not actually able to fully recover from some situation, it wakes up waiters, who kick off a new state manager thread. Quite often the fresh invocation of the state manager is just as successful. This results in a livelock as the client dumps thousands of NFS requests a second on the network in a vain attempt to recover. Not very friendly. To mitigate this situation, add a delay in the state manager after an unhandled error, so that the client sends just a few requests every second in this case. --- fs/nfs/nfs4state.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 5e3bebc..38959eb 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -2151,6 +2151,7 @@ static void nfs4_state_manager(struct nfs_client *clp) out_error: pr_warn_ratelimited("NFS: state manager failed on NFSv4 server %s" " with error %d\n", clp->cl_hostname, -status); + ssleep(1); nfs4_end_drain_session(clp); nfs4_clear_state_manager_bit(clp); } -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html