Re: [RFC v3 1/1] fs/namespace: remove RCU sync for MNT_DETACH umount

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 3/7/24 17:22, Christian Brauner wrote:
On Mon, Jul 01, 2024 at 02:10:31PM GMT, Christian Brauner wrote:
On Mon, Jul 01, 2024 at 10:41:40AM GMT, Alexander Larsson wrote:
On Mon, Jul 1, 2024 at 7:50 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
I always thought the rcu delay was to ensure concurrent path walks "see" the

umount not to ensure correct operation of the following mntput()(s).


Isn't the sequence of operations roughly, resolve path, lock, deatch,
release

lock, rcu wait, mntput() subordinate mounts, put path.
The crucial bit is really that synchronize_rcu_expedited() ensures that
the final mntput() won't happen until path walk leaves RCU mode.

This allows caller's like legitimize_mnt() which are called with only
the RCU read-lock during lazy path walk to simple check for
MNT_SYNC_UMOUNT and see that the mnt is about to be killed. If they see
that this mount is MNT_SYNC_UMOUNT then they know that the mount won't
be freed until an RCU grace period is up and so they know that they can
simply put the reference count they took _without having to actually
call mntput()_.

Because if they did have to call mntput() they might end up shutting the
filesystem down instead of umount() and that will cause said EBUSY
errors I mentioned in my earlier mails.
But such behaviour could be kept even without an expedited RCU sync.
Such as in my alternative patch for this:
https://www.spinics.net/lists/linux-fsdevel/msg270117.html

I.e. we would still guarantee the final mput is called, but not block
the return of the unmount call.
That's fine but the patch as sent doesn't work is my point. It'll cause
exactly the issues described earlier, no? So I'm confused why this
version simply ended up removing synchronize_rcu_expedited() when
the proposed soluton seems to have been to use queue_rcu_work().

But anyway, my concern with this is still that this changes the way
MNT_DETACH behaves when you shut down a non-busy filesystem with
MNT_DETACH as outlined in my other mail.

If you find a workable version I'm not entirely opposed to try this but
I wouldn't be surprised if this causes user visible issues for anyone
that uses MNT_DETACH on a non-used filesystem.
Correction: I misremembered that umount_tree() is called with
UMOUNT_SYNC only in the case that umount() isn't called with MNT_DETACH.
I mentioned this yesterday in the thread but just in case you missed it
I want to spell it out in detail as well.

Thanks Christian, I did see that, yep.

There's also the seqlock in there to alert the legitimize that it needs

to restart using ref-walk.



This is relevant because UMOUNT_SYNC will raise MNT_SYNC_UMOUNT on all
mounts it unmounts. And that ends up being checked in legitimize_mnt()
to ensure that legitimize_mnt() doesn't call mntput() during path lookup
and risking EBUSY for a umount(..., 0) + mount() sequence for the same
filesystem.

But for umount(.., MNT_DETACH) UMOUNT_SYNC isn't passed and so
MNT_SYNC_UMOUNT isn't raised on the mount and so legitimize_mnt() may
end up doing the last mntput() and cleaning up the filesystem.

In other words, a umount(..., MNT_DETACH) caller needs to be prepared to
deal with EBUSY for a umount(..., MNT_DETACH) + mount() sequence.

So I think we can certainly try this as long as we make it via
queue_rcu_work() to handle the other mntput_no_expire() grace period
dependency we discussed upthread.

Thanks for making take a closer look.

I'm still not sure I fully understand the subtleties of how this works, I

think I'll need to do a deep dive into the rcu code and then revisit the

umount code. At least I won't be idle, ;(


Nevertheless I have to thank both you and Honza for your efforts and tolerance.


Ian

Ian






[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux