Hi Paul,

Thanks for looking into this! Really appreciate it.

Paul Aurich <paul@xxxxxxxxxxxxxx> writes:

> The unmount process (cifs_kill_sb() calling close_all_cached_dirs()) can
> race with various cached directory operations, which ultimately results
> in dentries not being dropped and these kernel BUGs:
>
> BUG: Dentry ffff88814f37e358{i=1000000000080,n=/} still in use (2) [unmount of cifs cifs]
> VFS: Busy inodes after unmount of cifs (cifs)
> ------------[ cut here ]------------
> kernel BUG at fs/super.c:661!
>
> This happens when a cfid is in the process of being cleaned up when, and
> has been removed from the cfids->entries list, including:
>
> - Receiving a lease break from the server
> - Server reconnection triggers invalidate_all_cached_dirs(), which
>   removes all the cfids from the list
> - The laundromat thread decides to expire an old cfid.
>
> To solve these problems, dropping the dentry is done in queued work done
> in a newly-added cfid_put_wq workqueue, and close_all_cached_dirs()
> flushes that workqueue after it drops all the dentries of which it's
> aware. This is a global workqueue (rather than scoped to a mount), but
> the queued work is minimal.

Why does it need to be a global workqueue? Can't you make it per tcon?

> The final cleanup work for cleaning up a cfid is performed via work
> queued in the serverclose_wq workqueue; this is done separate from
> dropping the dentries so that close_all_cached_dirs() doesn't block on
> any server operations.
>
> Both of these queued works expect to invoked with a cfid reference and
> a tcon reference to avoid those objects from being freed while the work
> is ongoing.

Why do you need to take a tcon reference? Can't you drop the dentries
when tearing down tcon in cifs_put_tcon()? No concurrent mounts would
be able to access or free it.

After running xfstests I've seen a leaked tcon in
/proc/fs/cifs/DebugData with no CIFS superblocks, which might be related
to this.

Could you please check if there is any leaked connection in
/proc/fs/cifs/DebugData after running your tests?