Re: [PATCH] fs: Provide helpers for manipulating sb->s_readonly_remount

On Sat 17-06-23 09:33:42, Dave Chinner wrote:
> On Fri, Jun 16, 2023 at 06:38:27PM +0200, Jan Kara wrote:
> > Provide helpers to set and clear sb->s_readonly_remount including
> > appropriate memory barriers. Also use this opportunity to document what
> > the barriers pair with and why they are needed.
> > 
> > Suggested-by: Dave Chinner <david@xxxxxxxxxxxxx>
> > Signed-off-by: Jan Kara <jack@xxxxxxx>
> 
> The helper conversion looks fine so from that perspective the patch
> looks good.
> 
> However, I'm not sure the use of memory barriers is correct.

AFAICS, the barriers are correct but my documentation was not ;)
Christian's reply has all the details, but let me attempt a more targeted
reply here.

> IIUC, we want mnt_is_readonly() to return true whenever
> s_readonly_remount is set. Is that the behaviour we are trying to
> achieve for both ro->rw and rw->ro transitions?

Yes. But what matters is the ordering of s_readonly_remount checking wrt
other flags. See below.

> > ---
> >  fs/internal.h      | 26 ++++++++++++++++++++++++++
> >  fs/namespace.c     | 10 ++++------
> >  fs/super.c         | 17 ++++++-----------
> >  include/linux/fs.h |  2 +-
> >  4 files changed, 37 insertions(+), 18 deletions(-)
> > 
> > diff --git a/fs/internal.h b/fs/internal.h
> > index bd3b2810a36b..01bff3f6db79 100644
> > --- a/fs/internal.h
> > +++ b/fs/internal.h
> > @@ -120,6 +120,32 @@ void put_super(struct super_block *sb);
> >  extern bool mount_capable(struct fs_context *);
> >  int sb_init_dio_done_wq(struct super_block *sb);
> >  
> > +/*
> > + * Prepare superblock for changing its read-only state (i.e., either remount
> > + * read-write superblock read-only or vice versa). After this function returns
> > + * mnt_is_readonly() will return true for any mount of the superblock if its
> > + * caller is able to observe any changes done by the remount. This holds until
> > + * sb_end_ro_state_change() is called.
> > + */
> > +static inline void sb_start_ro_state_change(struct super_block *sb)
> > +{
> > +	WRITE_ONCE(sb->s_readonly_remount, 1);
> > +	/* The barrier pairs with the barrier in mnt_is_readonly() */
> > +	smp_wmb();
> > +}
> 
> I'm not sure how this wmb pairs with the memory barrier in
> mnt_is_readonly() to provide the correct behavior. The barrier in
> mnt_is_readonly() happens after it checks s_readonly_remount, so
> the s_readonly_remount in mnt_is_readonly is not ordered in any way
> against this barrier.
> 
> The barrier in mnt_is_readonly() ensures that the loads of SB_RDONLY
> and MNT_READONLY are ordered after the s_readonly_remount load, but we
> don't change those flags until a long way after s_readonly_remount
> is set.

You are correct. I've reread the code and the ordering that matters is
between __mnt_want_write() on the read side and reconfigure_super() on the
write side. In particular, for the RW->RO transition we must make sure that
if __mnt_want_write() does not see MNT_WRITE_HOLD set, it will see
s_readonly_remount set. There is another set of barriers in those functions
that makes sure sb_prepare_remount_readonly() sees the incremented
mnt_writers count if __mnt_want_write() did not see MNT_WRITE_HOLD set, but
that's a different story.

Hence the barrier in sb_start_ro_state_change() pairs with the smp_rmb()
barrier in __mnt_want_write() before the mnt_is_readonly() check at the end
of the function. I'll fix my patch, thanks for the correction.
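
To make the RW->RO pairing concrete, here is a minimal userspace sketch
using C11 atomics. It is not the kernel code: every model_* name is
hypothetical, the struct collapses the mount and superblock state into one
object, and the C11 fences only approximate smp_mb(), smp_wmb() and
smp_rmb() (a release fence is somewhat stronger than smp_wmb()). It only
shows where the two barrier pairs sit relative to the loads and stores
discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mount/superblock fields discussed above. */
struct model_mount {
	atomic_int  writers;          /* models mnt_writers */
	atomic_bool write_hold;       /* models MNT_WRITE_HOLD */
	atomic_bool readonly_remount; /* models sb->s_readonly_remount */
	atomic_bool rdonly;           /* models SB_RDONLY / MNT_READONLY */
};

static void model_init(struct model_mount *m)
{
	atomic_init(&m->writers, 0);
	atomic_init(&m->write_hold, false);
	atomic_init(&m->readonly_remount, false);
	atomic_init(&m->rdonly, false);
}

/* Write side, RW->RO: returns 0 on success, -1 if a writer is active. */
static int model_remount_ro(struct model_mount *m)
{
	atomic_store_explicit(&m->write_hold, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	if (atomic_load_explicit(&m->writers, memory_order_relaxed) > 0) {
		atomic_store_explicit(&m->write_hold, false, memory_order_relaxed);
		return -1;
	}
	/* Models sb_start_ro_state_change(): set the flag, then barrier. */
	atomic_store_explicit(&m->readonly_remount, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* models smp_wmb() */
	/* Only after the barrier is the hold flag dropped. */
	atomic_store_explicit(&m->write_hold, false, memory_order_relaxed);
	return 0;
}

/* Read side: returns 0 if write access is granted, -1 if read-only. */
static int model_want_write(struct model_mount *m)
{
	atomic_fetch_add_explicit(&m->writers, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* models smp_mb() */
	while (atomic_load_explicit(&m->write_hold, memory_order_relaxed))
		;	/* wait for the remount side to drop the hold flag */
	atomic_thread_fence(memory_order_acquire);	/* models smp_rmb() */
	/* Having seen the hold flag cleared, we must see the remount flag. */
	if (atomic_load_explicit(&m->readonly_remount, memory_order_relaxed) ||
	    atomic_load_explicit(&m->rdonly, memory_order_relaxed)) {
		atomic_fetch_sub_explicit(&m->writers, 1, memory_order_relaxed);
		return -1;
	}
	return 0;
}

static void model_drop_write(struct model_mount *m)
{
	atomic_fetch_sub_explicit(&m->writers, 1, memory_order_relaxed);
}

/* Single-threaded walk through both outcomes. */
static void model_demo(void)
{
	struct model_mount m;

	model_init(&m);
	assert(model_want_write(&m) == 0);	/* writer admitted */
	assert(model_remount_ro(&m) == -1);	/* remount sees the writer */
	model_drop_write(&m);
	assert(model_remount_ro(&m) == 0);	/* no writers, remount proceeds */
	assert(model_want_write(&m) == -1);	/* late writer is refused */
	assert(atomic_load(&m.writers) == 0);
}
```

model_demo() only exercises the sequential behaviour. The concurrent
guarantee rests on the fences: the release/acquire pair covers a reader that
saw the hold flag already cleared, and the two full fences cover a reader
that got in before the hold flag was set.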

> Hence if this is a ro->rw transition, then I can see that racing on
> s_readonly_remount isn't an issue, because the mount/sb
> flags will have SB_RDONLY/MNT_READONLY set and the correct thing
> will be done (i.e. consider code between sb_start_ro_state_change()
> and sb_end_ro_state_change() is RO).

Yes, for the RO->RW transition the barrier in sb_prepare_remount_readonly()
indeed pairs with the barrier in mnt_is_readonly(). It makes sure that if
mnt_is_readonly() observes s_readonly_remount == 0, it will observe
SB_RDONLY / MNT_READONLY still set.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR