Re: [PATCH] statmount: add a new supported_mask field

Christian Brauner <brauner@xxxxxxxxxx> · Tue, 4 Feb 2025 17:45:33 +0100

On Tue, Feb 04, 2025 at 10:57:39AM -0500, Jeff Layton wrote:
> On Tue, 2025-02-04 at 16:09 +0100, Christian Brauner wrote:
> > On Tue, Feb 04, 2025 at 07:28:20AM -0500, Jeff Layton wrote:
> > > On Tue, 2025-02-04 at 12:07 +0100, Christian Brauner wrote:
> > > > On Mon, Feb 03, 2025 at 12:09:48PM -0500, Jeff Layton wrote:
> > > > > Some of the fields in the statmount() reply can be optional. If the
> > > > > kernel has nothing to emit in that field, then it doesn't set the flag
> > > > > in the reply. This presents a problem: There is currently no way to
> > > > > know what mask flags the kernel supports since you can't always count on
> > > > > them being in the reply.
> > > > > 
> > > > > Add a new STATMOUNT_SUPPORTED_MASK flag and field that the kernel can
> > > > > set in the reply. Userland can use this to determine if the fields it
> > > > > requires from the kernel are supported. This also gives us a way to
> > > > > deprecate fields in the future, if that should become necessary.
> > > > > 
> > > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > ---
> > > > > I ran into this problem recently. We have a variety of kernels running
> > > > > that have varying levels of support of statmount(), and I need to be
> > > > > able to fall back to /proc scraping if support for everything isn't
> > > > > present. This is difficult currently since statmount() doesn't set the
> > > > > flag in the return mask if the field is empty.
> > > > > ---
> > > > >  fs/namespace.c             | 18 ++++++++++++++++++
> > > > >  include/uapi/linux/mount.h |  4 +++-
> > > > >  2 files changed, 21 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/namespace.c b/fs/namespace.c
> > > > > index a3ed3f2980cbae6238cda09874e2dac146080eb6..7ec5fc436c4ff300507c4ed71a757f5d75a4d520 100644
> > > > > --- a/fs/namespace.c
> > > > > +++ b/fs/namespace.c
> > > > > @@ -5317,6 +5317,21 @@ static int grab_requested_root(struct mnt_namespace *ns, struct path *root)
> > > > >  	return 0;
> > > > >  }
> > > > >  
> > > > > +/* This must be updated whenever a new flag is added */
> > > > > +#define STATMOUNT_SUPPORTED (STATMOUNT_SB_BASIC | \
> > > > > +			     STATMOUNT_MNT_BASIC | \
> > > > > +			     STATMOUNT_PROPAGATE_FROM | \
> > > > > +			     STATMOUNT_MNT_ROOT | \
> > > > > +			     STATMOUNT_MNT_POINT | \
> > > > > +			     STATMOUNT_FS_TYPE | \
> > > > > +			     STATMOUNT_MNT_NS_ID | \
> > > > > +			     STATMOUNT_MNT_OPTS | \
> > > > > +			     STATMOUNT_FS_SUBTYPE | \
> > > > > +			     STATMOUNT_SB_SOURCE | \
> > > > > +			     STATMOUNT_OPT_ARRAY | \
> > > > > +			     STATMOUNT_OPT_SEC_ARRAY | \
> > > > > +			     STATMOUNT_SUPPORTED_MASK)
> > > > 
> > > > Hm, do we need a separate bit for STATMOUNT_SUPPORTED_MASK? Afaiu, this
> > > > is more of a convenience thing but then maybe we just do:
> > > > 
> > > > #define STATMOUNT_SUPPORTED_MASK STATMOUNT_MNT_BASIC
> > > > 
> > > > and be done with it?
> > > > 
> > > > Otherwise I think it is worth having support for this.
> > > > 
> > > 
> > > Are you suggesting that we should just add the ->supported_mask field
> > > without a declaring a bit for it? If so, how would that work on old
> > > kernels? You'd never know if you could trust the contents of that field
> > > since the return mask wouldn't indicate any difference.
> > 
> > What I didn't realize because I hadn't read carefully enough in your
> > patch was that STATMOUNT_SUPPORTED_MASK is raised in ->mask and only
> > then is ->supported_mask filled in.
> > 
> > My thinking had been that ->supported_mask will simply always be filled
> > in by the kernel going forward. Which is arguably not ideal but would
> > work:
> > 
> > So the kernel guarantees since the introduction of statmount() that when
> > we copy out to userspace that anything the kernel doesn't know will be
> > copied back zeroed. So any unknown fields are zero.
> > 
> > (1) Say userspace passes a struct statmount with ->supported_mask to the
> >     kernel - even if it has put garbage in there or intentionally raised
> >     valid flags in there - the old kernel will copy over this and set it
> >     to zero.
> > 
> > (2) If you're on a new kernel but pass an old struct the kernel will
> >     fill in ->supported_mask. Imho, that's fine. Userspace simply will
> >     not know about it.
> > 
> > But we can leave the explicit request in!
> > 
> 
> 
> I can respin without STATMOUNT_SUPPORTED_MASK. I was thinking it left
> that part of the return buffer untouched, but if it's zeroed, then that
> works as well.
> 
> If you see a supported_mask of 0, you know the kernel didn't fill it in
> (since it should at least support _something_). That'll need to be
> carefully documented though.

It's probably easier for userspace if that flag must be specifically raised.