On 8/3/19 3:22 PM, Trond Myklebust wrote:
> On Sat, 2019-08-03 at 12:07 -0700, John Hubbard wrote:
>> On 8/3/19 7:40 AM, Trond Myklebust wrote:
>>> John Hubbard reports seeing the following stack trace:
>>>
>>> nfs4_do_reclaim
>>>   rcu_read_lock /* we are now in_atomic() and must not sleep */
>>>     nfs4_purge_state_owners
>>>       nfs4_free_state_owner
>>>         nfs4_destroy_seqid_counter
>>>           rpc_destroy_wait_queue
>>>             cancel_delayed_work_sync
>>>               __cancel_work_timer
>>>                 __flush_work
>>>                   start_flush_work
>>>                     might_sleep:
>>>                       (kernel/workqueue.c:2975: BUG)
>>>
>>> The solution is to separate out the freeing of the state owners
>>> from nfs4_purge_state_owners(), and perform that outside the atomic
>>> context.
>>>
>>
>> All better now--this definitely fixes it. I can reboot the server, and
>> of course that backtrace is gone. Then the client mounts hang, so I do
>> a mount -a -o remount, and after about 1 minute, the client mounts
>> start working again, with no indication of problems. I assume that the
>> pause is by design--timing out somewhere, to recover from the server
>> going missing for a while. If so, then all is well.
>>
>
> Thanks very much for the report, and for testing!
>
> With regards to the 1 minute delay, I strongly suspect that what you
> are seeing is the NFSv4 "grace period".
>
> After a NFSv4.x server reboot, the clients are given a certain amount
> of time in which to recover the file open state and lock state that
> they may have held before the reboot. All non-recovery opens, locks and
> all I/O are stopped while this recovery process is happening to ensure
> that locking conflicts do not occur. This ensures that all locks can
> survive server reboots without any loss of atomicity.
>
> With NFSv4.1 and NFSv4.2, the server can determine when all the clients
> have finished recovering state and end the grace period early, however
> I've recently seen cases where that was not happening. I'm not sure yet
> if that is a real server regression.
>

Aha, thanks for explaining. It's great to see such refined behavior now
from NFS, definitely enjoying v4! :)

thanks,
-- 
John Hubbard
NVIDIA
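
For readers following the thread, the shape of the fix quoted above (detach the state owners inside the atomic section, free them afterwards) can be sketched in userspace C. This is only an analogy under stated assumptions, not the kernel patch: a pthread mutex stands in for the rcu_read_lock() section, and struct owner, owners_lock, add_owner(), purge_owners() and free_owners() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a state owner; not the kernel's structure. */
struct owner {
    int id;
    struct owner *next;
};

/* The mutex plays the role of the section in which sleeping is forbidden. */
static pthread_mutex_t owners_lock = PTHREAD_MUTEX_INITIALIZER;
static struct owner *owners; /* shared list, protected by owners_lock */

/* Add one owner to the shared list. */
static void add_owner(int id)
{
    struct owner *o = malloc(sizeof(*o));

    assert(o != NULL); /* sketch-level error handling only */
    o->id = id;
    pthread_mutex_lock(&owners_lock);
    o->next = owners;
    owners = o;
    pthread_mutex_unlock(&owners_lock);
}

/*
 * Step 1: inside the locked section, only unhook the entries and hand
 * them back on a private list. Nothing is freed here, so nothing in
 * this section can block.
 */
static struct owner *purge_owners(void)
{
    struct owner *detached;

    pthread_mutex_lock(&owners_lock);
    detached = owners;
    owners = NULL;
    pthread_mutex_unlock(&owners_lock);
    return detached;
}

/*
 * Step 2: outside the locked section, tear the detached entries down.
 * In the kernel case this is the part that may sleep. Returns the
 * number of entries freed.
 */
static int free_owners(struct owner *head)
{
    int freed = 0;

    while (head != NULL) {
        struct owner *next = head->next;

        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

A caller would run free_owners(purge_owners()): the first call touches the shared list only while the lock is held, and the second does the potentially slow teardown once the lock has been dropped.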