Re: [PATCH] ovl: rely on SB_I_NOUMASK

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 13, 2023 at 10:56 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Fri, Oct 13, 2023 at 09:57:50AM +0300, Amir Goldstein wrote:
> > On Thu, Oct 12, 2023 at 6:37 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > >
> > > In commit f61b9bb3f838 ("fs: add a new SB_I_NOUMASK flag") we added a
> > > new SB_I_NOUMASK flag that is used by filesystems like NFS to indicate
> > > that umask stripping is never supposed to be done in the vfs independent
> > > of whether or not POSIX ACLs are supported.
> > >
> > > Overlayfs falls into the same category as it raises SB_POSIXACL
> > > unconditionally to defer umask application to the upper filesystem.
> > >
> > > Now that we have SB_I_NOUMASK use that and make SB_POSIXACL properly
> > > conditional on whether or not the kernel does have support for it. This
> > > will enable use to turn IS_POSIXACL() into nop on kernels that don't
> > > have POSIX ACL support avoding bugs from missed umask stripping.
> > >
> > > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
> > > ---
> > > Hey Amir & Miklos,
> > >
> > > This depends on the aforementioned patch in vfs.misc. So if you're fine
> > > with this change I'd take this through vfs.misc.
> >
> > Generally, I'm fine with a version of this going through the vfs tree.
> >
> > >
> > > Christian
> > > ---
> > >  fs/overlayfs/super.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > > index 9f43f0d303ad..361189b676b0 100644
> > > --- a/fs/overlayfs/super.c
> > > +++ b/fs/overlayfs/super.c
> > > @@ -1489,8 +1489,16 @@ int ovl_fill_super(struct super_block *sb, struct fs_context *fc)
> > >         sb->s_xattr = ofs->config.userxattr ? ovl_user_xattr_handlers :
> > >                 ovl_trusted_xattr_handlers;
> > >         sb->s_fs_info = ofs;
> > > +#ifdef CONFIG_FS_POSIX_ACL
> > >         sb->s_flags |= SB_POSIXACL;
> > > +#endif
> >
> > IDGI, if IS_POSIXACL() is going to turn into noop
> > why do we need this ifdef?
>
> Because it's wrong to advertise SB_POSIXACL support if the kernel
> doesn't support it at all. It's a minor correctness thing. I'll provide
> more details below.
>
> >
> > To me the flag SB_POSIXACL means that any given inode
> > MAY have a custom posix acl.
> >
> > >         sb->s_iflags |= SB_I_SKIP_SYNC | SB_I_IMA_UNVERIFIABLE_SIGNATURE;
> > > +       /*
> > > +        * Ensure that umask handling is done by the filesystems used
> > > +        * for the the upper layer instead of overlayfs as that would
> > > +        * lead to unexpected results.
> > > +        */
> > > +       sb->s_iflags |= SB_I_NOUMASK;
> > >
> >
> > Looks like FUSE has a similar pattern, although the testing and then
> > setting of SB_POSIXACL is perplexing to me:
> >
> >         /* Handle umasking inside the fuse code */
> >         if (sb->s_flags & SB_POSIXACL)
> >                 fc->dont_mask = 1;
> >         sb->s_flags |= SB_POSIXACL;
> >
> > And for NFS, why was SB_I_NOUMASK set for v4 and not for v3
> > when v3 clearly states:
> >
> >         case 3:
> >                 /*
> >                  * The VFS shouldn't apply the umask to mode bits.
> >                  * We will do so ourselves when necessary.
> >                  */
> >                 sb->s_flags |= SB_POSIXACL;
> >
> > Feels like I am missing parts of the big picture?
>
> When a filesystem doesn't raise SB_POSIXACL the vfs will strip the
> umask. When a filesystem does raise SB_POSIXACL the vfs will not strip
> the umask. Instead, the umask is stripped during posix_acl_create() in
> the individual filesystems.
>
> If the kernel doesn't support POSIX ACLs, i.e. CONFIG_FS_POSIX_ACL isn't
> set then posix_acl_create() is a nop. Hence, on a kernel without
> CONFIG_FS_POSIX_ACL any filesystem that raises SB_POSIXACL will not do
> any umask stripping at all: not on the vfs level because the vfs sees
> SB_POSIXACL and not on the filesystem level because posix_acl_create()
> is a nop.
>
> This complication in umask handling is a side-effect of POSIX ACLs and
> one that is really nasty. But the general idea is that you can't get one
> without the other. IOW, obviously no one should raise SB_POSIXACL if
> they don't intend to support POSIX ACLs...
>
> But as we all know, hackers like us gotta hack. So filesystems like NFS
> do use SB_POSIXACL for exactly that. But life without nuanced
> complications isn't worth living so NFS v3 and NFS v4 do it slightly
> differently.
>
> NFS v3 uses SB_POSIXACL to avoid umask stripping in the VFS but NFS v3
> does optionally support POSIX ACLs
>
> #ifdef CONFIG_NFS_V3_ACL
>         .listxattr      = nfs3_listxattr,
>         .get_inode_acl  = nfs3_get_acl,
>         .set_acl        = nfs3_set_acl,
> #endif
>
> and so raising SB_POSIXACL makes sense in that case. But in all
> likelihood NFS should conditionalize SB_POSIXACL for NFS v3 on
> CONFIG_NFS_V3_ACL or - if they want to prevent umask stripping
> completely - they also need to raise SB_I_NOUMASK. But I have zero idea
> what's correct for them so I trust Jeff and other people interested in
> NFS to figure out what they need.
>
> NFS v4 on the other hand clearly doesn't care about POSIX ACL support at
> all because they don't support it in any form. So NFS v4 is only
> interested in the side-effects of POSIX ACLs on umask handling.
>
> So for them SB_I_NOUMASK is clearly the right thing to do.
>
> My reasoning about overlayfs is that it falls into the same category as
> NFS v4 albeit for slightly different reasons. As a stacking filesystems
> with a writable upper layer overlayfs can never rely on umask handling
> done in the VFS because it doesn't (easily) know up front whether the
> underlying filesystems supports POSIX ACLs or not.
>
> And currently it always has to raise SB_POSIXACL to get the side-effects
> on umask handling even if the kernel isn't compiled with POSIX ACL
> support at all. IOW, you always want the upper layer to do the umask
> handling.
>
> So the correct thing to do is to raise SB_I_NOUMASK to communicate
> clearly that umask handling will always be done by the upper layer. And
> that in turn allows you to stop raising SB_POSIXACL unconditionally.
>
> Because in fs/overlayfs/{inode.c,overlayfs.h} you can also see that all
> ovl inode operations are nop-ed when CONFIG_FS_POSIX_ACL is turned off.
>

All right.

Acked-by: Amir Goldstein <amir73il@xxxxxxxxx>

Thanks for the explanation,
Amir.




[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux