On Mon, Jun 19, 2023 at 01:05:26PM +0200, Jan Kara wrote:
> On Sat 17-06-23 09:33:42, Dave Chinner wrote:
> > On Fri, Jun 16, 2023 at 06:38:27PM +0200, Jan Kara wrote:
> > > Provide helpers to set and clear sb->s_readonly_remount including
> > > appropriate memory barriers. Also use this opportunity to document what
> > > the barriers pair with and why they are needed.
> > >
> > > Suggested-by: Dave Chinner <david@xxxxxxxxxxxxx>
> > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> >
> > The helper conversion looks fine so from that perspective the patch
> > looks good.
> >
> > However, I'm not sure the use of memory barriers is correct.
>
> AFAICS, the barriers are correct but my documentation was not ;)
> Christian's reply has all the details but maybe let me attempt a bit more
> targeted reply here.

*nod*

> > IIUC, we want mnt_is_readonly() to return true whenever
> > s_readonly_remount is set. Is that the behaviour we are trying to
> > achieve for both ro->rw and rw->ro transitions?
>
> Yes. But what matters is the ordering of the s_readonly_remount check wrt
> other flags. See below.
>
> > > ---
> > >  fs/internal.h      | 26 ++++++++++++++++++++++++++
> > >  fs/namespace.c     | 10 ++++------
> > >  fs/super.c         | 17 ++++++-----------
> > >  include/linux/fs.h |  2 +-
> > >  4 files changed, 37 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/fs/internal.h b/fs/internal.h
> > > index bd3b2810a36b..01bff3f6db79 100644
> > > --- a/fs/internal.h
> > > +++ b/fs/internal.h
> > > @@ -120,6 +120,32 @@ void put_super(struct super_block *sb);
> > >  extern bool mount_capable(struct fs_context *);
> > >  int sb_init_dio_done_wq(struct super_block *sb);
> > >
> > > +/*
> > > + * Prepare superblock for changing its read-only state (i.e., either remount
> > > + * read-write superblock read-only or vice versa). After this function returns
> > > + * mnt_is_readonly() will return true for any mount of the superblock if its
> > > + * caller is able to observe any changes done by the remount. This holds until
> > > + * sb_end_ro_state_change() is called.
> > > + */
> > > +static inline void sb_start_ro_state_change(struct super_block *sb)
> > > +{
> > > +	WRITE_ONCE(sb->s_readonly_remount, 1);
> > > +	/* The barrier pairs with the barrier in mnt_is_readonly() */
> > > +	smp_wmb();
> > > +}
> >
> > I'm not sure how this wmb pairs with the memory barrier in
> > mnt_is_readonly() to provide the correct behavior. The barrier in
> > mnt_is_readonly() happens after it checks s_readonly_remount, so
> > the s_readonly_remount in mnt_is_readonly is not ordered in any way
> > against this barrier.
> >
> > The barrier in mnt_is_readonly() ensures that the loads of SB_RDONLY
> > and MNT_READONLY are ordered after s_readonly_remount(), but we
> > don't change those flags until a long way after s_readonly_remount
> > is set.
>
> You are correct. I've reread the code and the ordering that matters is
> __mnt_want_write() on the read side and reconfigure_super() on the write
> side. In particular for the RW->RO transition we must make sure that: if
> __mnt_want_write() does not see MNT_WRITE_HOLD set, it will see
> s_readonly_remount set. There is another set of barriers in those functions
> that makes sure sb_prepare_remount_readonly() sees incremented mnt_writers
> if __mnt_want_write() did not see MNT_WRITE_HOLD set, but that's a
> different story.

Yup, as I said to Christian, there is nothing in the old or new code
that even hints at an interaction with MNT_WRITE_HOLD or
__mnt_want_write() here. I couldn't make that jump from reading the
code, and so the memory barrier placement made no sense at all.

> Hence the barrier in sb_start_ro_state_change() pairs with the
> smp_rmb() barrier in __mnt_want_write() before the
> mnt_is_readonly() check at the end of the function. I'll fix my
> patch, thanks for the correction.

Please also update the mnt_[un]hold_writers() and __mnt_want_write()
documentation to point at the new sb_start/end_ro_state_change
helpers, as all the memory barriers in this code are tightly
coupled.

Thanks!

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx