On Mon, Jul 29, 2024 at 04:50:27PM -0500, Steve French wrote:
> On Mon, Jul 29, 2024 at 4:50 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> On Mon, Jul 29, 2024 at 12:33 PM Steve French <smfrench@xxxxxxxxx> wrote:
> > > The first step should be to identify what exactly keeps your mount busy
> > > in generic/044 and generic/043.
> >
> > That is a little tricky to debug (AFAIK no easy way to tell exactly which
> > reference is preventing the VFS from proceeding with the umount and
> > calling kill_sb). My best guess is something related to deferred close
> > (cached network file handles) that had a brief refcount on
> > something being checked by umount, but when I experimented with
> > deferred close settings that did not seem to affect the problem so
> > looking for other possible causes.
> >
> > I just did a quick experiment by adding a 1 second wait inside umount
> > and confirmed that that does fix it for those two tests when mounted to Samba,
> > but not clear why the slight delay in umount helps as there is no pending
> > network traffic at that point.
>
> I did some more experimentation and it looks like the umount problem
> with those two xfstests to Samba is related to IOC_SHUTDOWN.
> If I return EOPNOTSUPP on IOC_SHUTDOWN
> then the 1 second delay in umount is not necessary - so something that
> happens after IOC_SHUTDOWN races with umount (thus the 1 second delay
> that I tried as a quick experiment fixes it indirectly) in this
> testcase (although
> apparently this race between IOC_SHUTDOWN and umount is not an issue
> to some other servers but is reproducible to Samba and ksmbd (at least
> in some easy to setup configurations)

So you've likely got a race condition where something takes longer
when the shutdown flag is set than when the filesystem is operating
normally.

There's not a lot in the CIFS code that pays attention to the
shutdown flag - almost all of the checks are aborting front end
(syscall) operations before they are started.
The only back end check appears to be in cifs_issue_write(). Perhaps
that is failing to wake the request queue when it is being failed
with -EIO on a shutdown, and so it takes some time for something
else to wake it up and empty it and complete the pending writes
before the fs can be unmounted...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
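
For illustration only, the missed wake-up suspected above can be modelled
in a few lines of single-threaded user-space C. This is a toy sketch, not
CIFS or netfs code: struct model, fail_write(), unrelated_wakeup() and
ticks_until_unmount() are hypothetical names, and the "waiter" stands in
for whatever the unmount path sleeps on. It only shows why an error path
that completes requests without waking the sleeper leaves unmount
dependent on some later, unrelated event.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a queue of pending writes plus one sleeper
 * (the unmount path) waiting for the queue to drain. Not kernel code. */
struct model {
	int  pending;        /* writes issued but not yet completed */
	bool waiter_asleep;  /* unmount sleeps until it sees pending == 0 */
};

/* Complete one write with an error. With wake == false this models the
 * suspected bug: the count drops, but the sleeper is never notified. */
static void fail_write(struct model *m, bool wake)
{
	m->pending--;
	if (wake && m->pending == 0)
		m->waiter_asleep = false;
}

/* Some later, unrelated event (a timer, another completion) that makes
 * the sleeper recheck its condition. */
static void unrelated_wakeup(struct model *m)
{
	if (m->pending == 0)
		m->waiter_asleep = false;
}

/* Fail every pending write, then count how many unrelated wake-ups are
 * needed before the unmount path can make progress. */
static int ticks_until_unmount(int writes, bool wake_on_error)
{
	struct model m;
	int ticks = 0;

	assert(writes >= 0);
	m.pending = writes;
	m.waiter_asleep = writes > 0;  /* nothing to wait for if idle */

	while (m.pending > 0)
		fail_write(&m, wake_on_error);

	while (m.waiter_asleep) {
		unrelated_wakeup(&m);
		ticks++;
	}
	return ticks;
}
```

In this model ticks_until_unmount(3, true) needs no extra wake-up, while
ticks_until_unmount(3, false) has to wait for one. If the real error path
behaves like the second case, that would be consistent with a short delay
in umount hiding the problem, but that remains an assumption to check
against the actual code.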