Re: [PATCH v4 1/1] exec: seal system mappings

Benjamin Berg <benjamin@xxxxxxxxxxxxxxxx> · Thu, 16 Jan 2025 18:01:47 +0100

Hi Lorenzo,

On Thu, 2025-01-16 at 15:48 +0000, Lorenzo Stoakes wrote:
> On Wed, Jan 15, 2025 at 12:20:59PM -0800, Jeff Xu wrote:
> > On Wed, Jan 15, 2025 at 11:46 AM Lorenzo Stoakes
> > <lorenzo.stoakes@xxxxxxxxxx> wrote:
> 
> [SNIP]
> > 
> > > I've made it abundantly clear that this (NACKed) series cannot allow the
> > > kernel to be in a broken state even if a user sets flags to do so.
> > > 
> > > This is because users might lack context to make this decision and
> > > incorrectly do so, and now we ship a known-broken kernel.
> > > 
> > > You are now suggesting disabling the !CRIU requirement. Which violates my
> > > _requirements_ (not optional features).
> > > 
> > Sure, I can add CRIU back.
> > 
> > Are you fine with UML and gViso not working under this CONFIG ?
> > UML/gViso doesn't use any KCONFIG like CRIU does.
> 
> Yeah this is a concern, wouldn't we be able to catch UML with a flag?
> 
> Apologies my fault for maybe not being totally up to date with this, but what
> exactly was the gViso (is it gVisor actually?)

UML is a separate architecture. It is a Linux kernel running as a
userspace application on top of an unmodified host kernel.

So really, UML is a mostly weird userspace program for the purpose of
this discussion. And a pretty buggy one too--it got broken by rseq
already.

What UML now does is:
 * execute a tiny static binary
 * map special "stub" code/data pages at the topmost userspace address
   (replacing its stack)
 * continue execution inside the "stub" pages
 * unmap everything below the "stub" pages
 * use the unmapped area for userspace application mappings

I believe that the "unmap everything" step will fail with this feature.

Now, I am sure one can come up with solutions, e.g.:
   1. Simply print an explanation if the munmap() fails
   2. Find an address that is guaranteed to be below the VDSO and use a
      smaller address space for the UML userspace.
   3. Somehow tell the host kernel to not install the VDSO mappings
   4. Add the host VDSO pages as a sealed VMA within UML to guard them
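
To make option 2 a bit more concrete, here is a rough sketch (not UML
code) of how one could find the lowest special mapping from the text of
/proc/self/maps and treat that as the upper bound for the area to unmap.
The set of mapping names, the helper names and the sample addresses are
all my assumptions for illustration; what actually gets sealed depends
on the patch and the architecture.

```python
# Sketch only: the names below are assumed candidates for sealing, not a
# list taken from the patch.
SPECIAL_NAMES = {"[vdso]", "[vvar]", "[vsyscall]", "[sigpage]", "[uprobes]"}

# Made-up sample in /proc/<pid>/maps format, used instead of the live file
# so the example behaves the same everywhere.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 123 /usr/bin/stub
7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]
7ffd1c1f9000-7ffd1c1fd000 r--p 00000000 00:00 0 [vvar]
7ffd1c1fd000-7ffd1c1ff000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
"""


def special_mappings(maps_text):
    """Return (start, end, name) for every special mapping in maps_text."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional name.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a name
        name = fields[5].strip()
        if name in SPECIAL_NAMES:
            start, end = (int(x, 16) for x in fields[0].split("-"))
            found.append((start, end, name))
    return found


def usable_ceiling(maps_text, stub_start):
    """Highest address such that [0, ceiling) contains no special mapping.

    Mappings at or above stub_start (e.g. vsyscall) are ignored because
    the "unmap everything below the stub" step never touches them.
    """
    starts = [s for s, _end, _name in special_mappings(maps_text)
              if s < stub_start]
    return min(starts + [stub_start])
```

With the sample above and a hypothetical stub at 0x7fffffffe000,
usable_ceiling() returns the start of [vvar], so only the range below
that address would be unmapped and handed to UML userspace. On a real
host the input would be the contents of /proc/self/maps.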

UML is a bit of a niche and I am not sure it is worth worrying about it
too much.

Benjamin

> 
> > 
> > > You seem to be saying you're pushing an internal feature on upstream and
> > > only care about internal use cases, this is not how upstream works, as
> > > Matthew alludes to.
> > > 
> > > I have told you that my requirements are:
> > > 
> > > 1. You cannot allow a user to set config or boot options to have a
> > >    broken kernel configuration.
> > > 
> > Can you clarify on the definition of "broken kernel configuration":
> 
> Anything that'd break userland in a way that would be entirely
> unexpected.
> 
> Especially so if there is a real disconnect between the person who is
> enabling the feature and the program.
> 
> For instance if a distro wants to be big on security, is (as is entirely
> reasonable) concerned about an unsealed VDSO/VVAR/etc. being exploited, so
> turns on the flag, but _doesn't realise_ or doesn't communicate (such a big
> problem and difficult actually for many distros/vendors) that this will
> break certain programs - and then users do a kernel update, and *bang*
> their whole system is broken.
> 
> It's really this kind of scenario I'm worried about.
> 
> This is the crux of it really.
> 
> > 
> > Do you consider "setting mseal kernel cmd line under 32 bit build" as broken ?
> > If so, this problem is not solvable and I might just not try to solve
> > it for the next version.
> 
> Yeah, I really don't like the kernel cmd line thing, because of this risk
> of disconnect - your justification for it is prima facie reasonable - the
> distro didn't want to enable the thing by default but you want more
> security - but then we have this issue with the possible disconnect between
> 'hey here is security feature X' vs. 'security feature X breaks Y, Z +
> alpha'.
> 
> > 
> > If you just refer to a need to detect CRIU, in KCONFIG or/and kernel
> > cmd line,  this is solvable.
> > 
> > > 2. You must provide evidence that the arches you claim work with this,
> > >    actually do.
> > > 
> > Sure
> 
> See my reply to Kees as to what this comprises, sorry if I was not clear
> previously.
> 
> 
> > 
> > > You seem to have eliminated that from your summary as if the very thing
> > > that makes this series NACKed were not pertinent.
> > > 
> > In my last email, I tried to cover all code-logic related comments,
> > which is blocking me.
> > I also mentioned I will address non-code related comments
> > (threat-model/test etc),  later.
> 
> Ack.
> 
> I felt that you hadn't hit on my fundamental objections and this was in
> effect - a final analysis as to how you would be moving forward with v5 -
> but apologies if you did intend to separately discuss them.
> 
> > 
> > > if you do not address these correctly, I will simply have to reject your v5
> > > too and it'll waste everybody's time. I _genuinely_ don't want to have to
> > > do this.
> > > 
> > > Any solution MUST fulfil these requirements. I also want to see v5 as an
> > > RFC honestly at this stage, since it seems we are VERY MUCH in a discussion
> > > phase rather than a patch phase at this time.
> > > 
> > Sure.
> 
> To be clear - if the series is viable, I want to see it merged. And to
> further clarify - a simpler, smaller version of this that explicitly
> disallows breakage in config options suffices (though we must clarify the
> gVisor + UML things).
> 
> If I just wanted to reject this outright, I'd tell you :) (I don't).
> 
> I just need to feel vaguely less anxious about breaking things! :)
> 
> > 
> > > I really want to help you improve mseal and get things upstream, but I
> > > can't ignore my duty to ensure that the kernel remains stable and we don't
> > > hand kernel users (overly huge) footguns. I hate to be negative, but this
> > > is why I am pushing back so much here.
> > > 
> > Thanks. You can help me by answering my questions, and clarify your
> > requirements. I appreciate your time to make this feature useful.
> 
> Sure, hopefully I have done so, do follow up if anything was unclear.
> 
> > 
> > Please take note that the security feature often takes away
> > capabilities.  Sometimes it is impossible to meet security, usability
> > or performance goals simultaneously. I'm trying my best to get all
> > aspects satisfied.
> 
> Ack, and I realise it's often a difficult trade-off. I just worry about
> compounding complexity in consequences of kernel configuration vs. userland
> stuff + the disconnect between the two.
> 
> > 
> > -Jeff
> > 
> > > Thanks!
> 
> Cheers, Lorenzo
>