Re: [PATCH v2 4/4] smb: During unmount, ensure all cached dir instances drop their dentry

Hi Paul,

Thanks for looking into this!  Really appreciate it.

Paul Aurich <paul@xxxxxxxxxxxxxx> writes:

> The unmount process (cifs_kill_sb() calling close_all_cached_dirs()) can
> race with various cached directory operations, which ultimately results
> in dentries not being dropped and these kernel BUGs:
>
> BUG: Dentry ffff88814f37e358{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
> VFS: Busy inodes after unmount of cifs (cifs)
> ------------[ cut here ]------------
> kernel BUG at fs/super.c:661!
>
> This happens when a cfid is in the process of being cleaned up and has
> already been removed from the cfids->entries list, in cases including:
>
> - Receiving a lease break from the server
> - Server reconnection triggers invalidate_all_cached_dirs(), which
>   removes all the cfids from the list
> - The laundromat thread decides to expire an old cfid.
>
> To solve these problems, dropping the dentry is done in queued work done
> in a newly-added cfid_put_wq workqueue, and close_all_cached_dirs()
> flushes that workqueue after it drops all the dentries of which it's
> aware. This is a global workqueue (rather than scoped to a mount), but
> the queued work is minimal.

Why does it need to be a global workqueue?  Can't you make it per tcon?
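
For reference, the pattern in the quoted paragraph above (defer the dentry
drop to queued work, then flush that queue at unmount) can be sketched in
user space roughly as follows. This is only an illustration, not the code
from the patch: it is single-threaded, and every name in it (fake_cfid,
fake_wq, fake_queue_put, fake_flush, fake_unmount) is hypothetical and
stands in for the real cfid, workqueue, and close_all_cached_dirs() logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached dir entry that pins a dentry. */
struct fake_cfid {
	int dentry_refs;             /* references held on the (fake) dentry */
	struct fake_cfid *next_work; /* link in the pending-work FIFO */
};

/* Hypothetical stand-in for a workqueue: a FIFO of pending dentry drops. */
struct fake_wq {
	struct fake_cfid *head;
	struct fake_cfid *tail;
};

/* Queue the dentry drop for a cfid that has already left the entries list. */
static void fake_queue_put(struct fake_wq *wq, struct fake_cfid *cfid)
{
	cfid->next_work = NULL;
	if (wq->tail)
		wq->tail->next_work = cfid;
	else
		wq->head = cfid;
	wq->tail = cfid;
}

/* Run all pending work now; returns how many work items were executed. */
static int fake_flush(struct fake_wq *wq)
{
	int ran = 0;

	while (wq->head) {
		struct fake_cfid *cfid = wq->head;

		wq->head = cfid->next_work;
		if (!wq->head)
			wq->tail = NULL;

		assert(cfid->dentry_refs > 0);
		cfid->dentry_refs--; /* the deferred "drop the dentry" step */
		ran++;
	}
	return ran;
}

/*
 * Unmount path: drop the dentries of the cfids still on the list, then
 * flush the queue so that cfids already removed from the list have
 * dropped theirs too before the superblock is torn down.
 */
static int fake_unmount(struct fake_cfid **listed, size_t n,
			struct fake_wq *wq)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (listed[i]->dentry_refs > 0)
			listed[i]->dentry_refs--;
	}
	return fake_flush(wq);
}
```

In this sketch the queue is passed in explicitly, so it could be global or
scoped per tcon without changing the flush logic; what matters for unmount
is only that the flush covers every drop queued for that superblock.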

> The final cleanup work for cleaning up a cfid is performed via work
> queued in the serverclose_wq workqueue; this is done separately from
> dropping the dentries so that close_all_cached_dirs() doesn't block on
> any server operations.
>
> Both of these queued works expect to be invoked with a cfid reference and
> a tcon reference to prevent those objects from being freed while the work
> is ongoing.

Why do you need to take a tcon reference?  Can't you drop the dentries
when tearing down tcon in cifs_put_tcon()?  No concurrent mounts would
be able to access or free it.

After running xfstests I've seen a leaked tcon in
/proc/fs/cifs/DebugData with no CIFS superblocks, which might be related
to this.

Could you please check if there is any leaked connection in
/proc/fs/cifs/DebugData after running your tests?



