Re: [RFC PATCH v5 0/7] mseal system mappings

Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> · Wed, 12 Feb 2025 14:01:42 +0000

(sorry I really am struggling to reply to mail as lore still seems to be
broken).

On Wed, Feb 12, 2025 at 12:37:50PM +0000, Pedro Falcato wrote:
> On Wed, Feb 12, 2025 at 11:25 AM Lorenzo Stoakes
> <lorenzo.stoakes@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 12, 2025 at 03:21:48AM +0000, jeffxu@xxxxxxxxxxxx wrote:
> > > From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> > >
> > > The commit message in the first patch contains the full description of
> > > this series.
> >
> > Sorry to nit, but it'd be useful to reproduce in the cover letter too! But
> > this obviously isn't urgent, just be nice when we un-RFC.
> >
> > Thanks for sending as RFC, appreciated, keen to figure out a way forward
> > with this series and this gives us space to discuss.
> >
> > One thing that came up recently with the LWN article (...!) was that rr is
> > also impacted by this [0].
> >
> > I think with this behind a config flag we're fine (this refers to my
> > 'opt-in' comment in the reply on LWN) as my concerns about this being
> > enabled in a broken way without an explicit kernel configuration are
> > addressed, and actually we do expose a means by which a user can detect if
> > the VDSO for instance is sealed via /proc/$pid/[s]maps.
> >
> > So tools like rr and such can be updated to check for this. I wonder if we
> > ought to try to liaise with the known problematic ones?
> >
> > It'd be nice to update the documentation to have a list of 'known
> > problematic userland software with sealed VDSO' so we make people aware.
> >
> > Hopefully we are achieving the opt-in nature of the thing here, but it
> > makes me wonder whether we need a prctl() interface to optionally disable
> > even if the system has it enabled as a whole.
>
> Just noting that (as we discussed off-list) doing prctl() would not
> work, because that would effectively be an munseal for those vdso
> regions.
> Possibly something like a personality() flag (that's *not* inherited
> when AT_SECURE/secureexec). But personalities have other issues...

Thanks, yeah that's a good point, it would have to be implemented as a
personality or something similar otherwise you're essentially relying on
'unsealing' which can't be permitted.

I'm not sure how useful that'd be for the likes of rr though. But I suppose
if it makes everything exec'd by a child inherit it then maybe that works
for a debugging session etc.?

>
> FWIW, although it would (at the moment) be hard to pull off in the
> libc, I still much prefer it to playing these weird games with CONFIG
> options and kernel command line options and prctl and personality and
> whatnot. It seems to me like we're trying to stick policy where it
> doesn't belong.

The problem is, as a security feature, you don't want to make it trivially
easy to disable.

I mean we _need_ a config option so we can strictly limit the feature to
architectures and configuration option combinations that are known to
work.

But if there is userspace that will be broken, we really have to have some
way of avoiding the disconnect between somebody making a policy decision at
the kernel level and somebody trying to run something.

Because I can easily envision somebody enabling this as a 'good security
feature' for a distro release or such, only for somebody else to later try
rr, CRIU, or whatever else and for it to just not work or fail subtly and
to have no idea why.

I mean one option is to require both a CONFIG_ flag _and_ a tunable to
enable it, so then it can become sysctl.d policy for instance.

The CONFIG_ flag dependency is critical because we don't want to enable
this on arches that have not been tested against it.

It's vital at any rate that we document everywhere we can that _this might
break some userland that depends on remapping the VDSO_.
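
To make the detection side concrete, here is a rough sketch of how a tool
might check whether the VDSO is sealed by reading /proc/$pid/smaps. It
assumes sealed mappings carry the "sl" entry in their VmFlags line, which
as far as I know is only reported on 64-bit kernels that include the smaps
sealing patch. The helper names are made up for illustration and this is
not what rr or CRIU actually do.

```python
import re

# A mapping header line in smaps starts with "start-end " in lowercase hex.
_HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")


def mapping_flags(smaps_text, name):
    """Return the VmFlags set of the first mapping called `name`, or None."""
    current = None
    for line in smaps_text.splitlines():
        if _HEADER.match(line):
            # Fields: range, perms, offset, dev, inode, [name]
            fields = line.split(None, 5)
            current = fields[5].strip() if len(fields) > 5 else ""
        elif line.startswith("VmFlags:") and current == name:
            return set(line.split()[1:])
    return None


def is_sealed(smaps_text, name="[vdso]"):
    """True if `name` is present and reports 'sl' (assumed: sealed)."""
    flags = mapping_flags(smaps_text, name)
    return flags is not None and "sl" in flags


def vdso_sealed_for(pid="self"):
    """Check a live process; needs permission to read its smaps."""
    with open(f"/proc/{pid}/smaps") as f:
        return is_sealed(f.read())
```

A tool that remaps the VDSO could call vdso_sealed_for() up front and bail
out with a clear message rather than failing subtly later on.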

>
> --
> Pedro