Re: [PATCH 21/29] ovl: implement get acl method

Christian Brauner <brauner@xxxxxxxxxx> · Fri, 23 Sep 2022 17:07:55 +0200

On Fri, Sep 23, 2022 at 04:59:42PM +0200, Miklos Szeredi wrote:
> On Thu, 22 Sept 2022 at 17:18, Christian Brauner <brauner@xxxxxxxxxx> wrote:
> >
> > The current way of setting and getting posix acls through the generic
> > xattr interface is error prone and type unsafe. The vfs needs to
> > interpret and fix up posix acls before storing them or reporting them
> > to userspace. Various hacks exist to make this work. The code is hard
> > to understand and difficult to maintain in its current form. Instead of
> > making this work by hacking posix acls through xattr handlers we are
> > building a dedicated posix acl api around the get and set inode
> > operations. This removes a lot of hackiness and makes the codepaths
> > easier to maintain. A lot of background can be found in [1].
> >
> > In order to build a type safe posix api around get and set acl we need
> > all filesystems to implement get and set acl.
> >
> > Now that we have added get and set acl inode operations that allow easy
> > access to the dentry we give overlayfs its own get and set acl inode
> > operations.
> >
> > Since overlayfs is a stacking filesystem it will use the newly added
> > posix acl api when retrieving posix acls from the relevant layer.
> >
> > Overlayfs can also be mounted on top of idmapped layers. If idmapped
> > layers are used, overlayfs must take the layer's idmapping into
> > account after it has retrieved the posix acls from the relevant layer.
> >
> > Note, until the vfs has been switched to the new posix acl api this
> > patch is a non-functional change.
> >
> > Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@xxxxxxxxxx [1]
> > Signed-off-by: Christian Brauner (Microsoft) <brauner@xxxxxxxxxx>
> > ---
> >  fs/overlayfs/dir.c       |  3 +-
> >  fs/overlayfs/inode.c     | 63 ++++++++++++++++++++++++++++++++++++----
> >  fs/overlayfs/overlayfs.h | 10 +++++--
> >  3 files changed, 67 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index 7bece7010c00..eb49d5d7b56f 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -1311,7 +1311,8 @@ const struct inode_operations ovl_dir_inode_operations = {
> >         .permission     = ovl_permission,
> >         .getattr        = ovl_getattr,
> >         .listxattr      = ovl_listxattr,
> > -       .get_inode_acl  = ovl_get_acl,
> > +       .get_inode_acl  = ovl_get_inode_acl,
> > +       .get_acl        = ovl_get_acl,
> >         .update_time    = ovl_update_time,
> >         .fileattr_get   = ovl_fileattr_get,
> >         .fileattr_set   = ovl_fileattr_set,
> > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> > index ecb51c249466..dd11e13cd288 100644
> > --- a/fs/overlayfs/inode.c
> > +++ b/fs/overlayfs/inode.c
> > @@ -14,6 +14,8 @@
> >  #include <linux/fileattr.h>
> >  #include <linux/security.h>
> >  #include <linux/namei.h>
> > +#include <linux/posix_acl.h>
> > +#include <linux/posix_acl_xattr.h>
> >  #include "overlayfs.h"
> >
> >
> > @@ -460,9 +462,9 @@ ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size)
> >   * of the POSIX ACLs retrieved from the lower layer to this function to not
> >   * alter the POSIX ACLs for the underlying filesystem.
> >   */
> > -static void ovl_idmap_posix_acl(struct inode *realinode,
> > -                               struct user_namespace *mnt_userns,
> > -                               struct posix_acl *acl)
> > +void ovl_idmap_posix_acl(struct inode *realinode,
> > +                        struct user_namespace *mnt_userns,
> > +                        struct posix_acl *acl)
> >  {
> >         struct user_namespace *fs_userns = i_user_ns(realinode);
> >
> > @@ -495,7 +497,7 @@ static void ovl_idmap_posix_acl(struct inode *realinode,
> >   *
> >   * This is obviously only relevant when idmapped layers are used.
> >   */
> > -struct posix_acl *ovl_get_acl(struct inode *inode, int type, bool rcu)
> > +struct posix_acl *ovl_get_inode_acl(struct inode *inode, int type, bool rcu)
> >  {
> >         struct inode *realinode = ovl_inode_real(inode);
> >         struct posix_acl *acl, *clone;
> > @@ -547,6 +549,53 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type, bool rcu)
> >         posix_acl_release(acl);
> >         return clone;
> >  }
> > +
> > +static struct posix_acl *ovl_get_acl_path(const struct path *path,
> > +                                         const char *acl_name)
> > +{
> > +       struct posix_acl *real_acl, *clone;
> > +       struct user_namespace *mnt_userns;
> > +
> > +       mnt_userns = mnt_user_ns(path->mnt);
> > +
> > +       real_acl = vfs_get_acl(mnt_userns, path->dentry, acl_name);
> > +       if (IS_ERR(real_acl))
> > +               return real_acl;
> > +       if (!real_acl)
> > +               return NULL;
> 
> if (IS_ERR_OR_NULL(real_acl))
>     return real_acl;

Thanks.
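
For reference, a minimal userspace sketch of why the two checks collapse
into one. The model_* helpers are hypothetical stand-ins written for this
illustration and only mimic the kernel's error-pointer convention (errno
values encoded in the top 4095 addresses); they are not the kernel macros
themselves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the kernel convention of a 4095-value errno window. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (stand-in for ERR_PTR()). */
static void *model_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Stand-in for IS_ERR(): true if the pointer lies in the errno window. */
static int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Stand-in for IS_ERR_OR_NULL(). */
static int model_is_err_or_null(const void *p)
{
	return !p || model_is_err(p);
}

/* The original two-step check: returns 1 if the caller would bail out. */
static int bails_two_step(const void *p)
{
	if (model_is_err(p))
		return 1;
	if (!p)
		return 1;
	return 0;
}

/* The suggested single check: returns 1 if the caller would bail out. */
static int bails_one_step(const void *p)
{
	return model_is_err_or_null(p);
}

/* Returns 1 when both variants make the same decision for p. */
static int model_checks_agree(const void *p)
{
	return bails_two_step(p) == bails_one_step(p);
}
```

In both variants the early return hands back the pointer unchanged, so an
error pointer stays an error pointer and NULL stays NULL.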

> 
> > +
> > +       if (!is_idmapped_mnt(path->mnt))
> > +               return real_acl;
> > +
> > +       /*
> > +        * We cannot alter the ACLs returned from the relevant layer as that
> > +        * would alter the cached values filesystem wide for the lower
> > +        * filesystem. Instead we can clone the ACLs and then apply the
> > +        * relevant idmapping of the layer.
> > +        */
> 
> Can't vfs_get_acl() return 'const posix_acl *' to enforce that?

The problem is that struct posix_acl is reference counted and often has
to be passed to functions such as posix_acl_release() or
posix_acl_dup().
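
To illustrate the point, here is a small userspace sketch, not kernel
code. struct acl_model and the acl_model_*() helpers are hypothetical
stand-ins for struct posix_acl, posix_acl_dup() and posix_acl_release();
the only property they share with the real thing is that taking or
dropping a reference writes to the object.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted ACL object. */
struct acl_model {
	int refcount;
	int entries;
};

static struct acl_model *acl_model_alloc(int entries)
{
	struct acl_model *acl = malloc(sizeof(*acl));

	if (!acl)
		return NULL;
	acl->refcount = 1;
	acl->entries = entries;
	return acl;
}

/* Taking a reference modifies the object, so it needs a non-const pointer. */
static struct acl_model *acl_model_dup(struct acl_model *acl)
{
	if (acl)
		acl->refcount++;
	return acl;
}

/* Dropping a reference also modifies the object; returns 1 if it was freed. */
static int acl_model_release(struct acl_model *acl)
{
	if (!acl)
		return 0;
	if (--acl->refcount == 0) {
		free(acl);
		return 1;
	}
	return 0;
}

/* A getter returning a const pointer, as in the suggestion above. */
static const struct acl_model *acl_model_get_const(struct acl_model *cached)
{
	return acl_model_dup(cached);
}

/* Returns 1 if the references balance as expected, 0 otherwise. */
static int demo_const_refcount(void)
{
	struct acl_model *cached = acl_model_alloc(3);
	const struct acl_model *ro;
	int freed_first, freed_second;

	if (!cached)
		return 0;
	ro = acl_model_get_const(cached);

	/*
	 * acl_model_release(ro) would discard the const qualifier, so every
	 * caller has to cast it away just to drop its reference.
	 */
	freed_first = acl_model_release((struct acl_model *)ro);
	freed_second = acl_model_release(cached);

	return freed_first == 0 && freed_second == 1;
}
```

In this model every caller of the const getter ends up casting the
qualifier away again, so the const return type documents intent but does
not enforce anything.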