Re: [PATCH v8 0/4] Introduce mseal

Jeff Xu <jeffxu@xxxxxxxxxx> · Thu, 1 Feb 2024 19:20:08 -0800

On Thu, Feb 1, 2024 at 3:15 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 1 Feb 2024 at 14:54, Theo de Raadt <deraadt@xxxxxxxxxxx> wrote:
> >
> > Linus, you are in for a shock when the proposal doesn't work for glibc
> > and all the applications!
>
> Heh. I've enjoyed seeing your argumentative style that made you so
> famous back in the days. Maybe it's always been there, but I haven't
> seen the BSD people in so long that I'd forgotten all about it.
>
> That said, famously argumentative or not, I think Theo is right, and I
> do think the MAP_SEALABLE bit is nonsensical.
>
> If somebody wants to mseal() a memory region, why would they need to
> express that ahead of time?
>
I like to look at things from the point of view of average Linux
userspace developers,  they might not have the same level of expertise
as the other folks on this email list or they might not have time and
mileage for those details.

To me, the most important thing is to deliver a feature that's easy to
use and works well. I don't want users to mess things up, so if I'm
the one giving them the tools, I'm going to make sure they have all
the information they need and that there are safeguards in place.

e.g. considering the following user case:
1> a security sensitive data is allocated from heap, using malloc,
from the software component A, and filled with information.
2> software component B then uses mprotect to change it to RO, and
seal it using mseal().

Yes. we could choose to allow it. But there are complications:

1> Is this the right pattern ? why don't component A already seal it
if they think it is important ?
2> Why  heap, why not mmap() a new memory mapping for that security data ?
3>  free() will not respect the situation of whether the memory is
sealed or not. How would a new developer know they probably shall
never free the sealed memory ?
4>  brk-shrink will never be able to pass the VMA that gets splited
out by mseal(), there are memory footprint implications to the
process.
5>  what if the security sensitive data happens to be the first VMA or
last VMA of the heap, will sealing the first VMA/last VMA cause any
issue there ? since they might carry important VMA flags ? ( I don't
know enough about brk.)
6> If we ever support sealing the heap for its entirety (make it not
executable), and still want to support other brk behaviors, such as
shrink/grow, would that conflict with current mseal(), if we allow it
on heap from beginning ?

Questions like that, without clear answers, to me it is premature to
already let developers start using  mseal() for heap.

And even if we have all the answers for heap, how about stack, or
other types of virtual memory ?

Again, I don't have enough knowledge to get a complete list that
shouldn't be sealed,  the input from Theo is none should I worry
about.  However it  is clearly not none to me, besides heap mentioned,
there is also aio/shm.

So MAP_SEALABLE is a conservative approach to limit the scope to ***
two known use cases *** that  I want to work on (libc and chrome) and
give  time needed to answer those questions. It is like a claim: only
those marked by MAP_SEALABLE support the sealing at this point of
time.

And MAP_SEALABLE is reversible, e.g. a sysctl could be added to make
all memory sealable in the future, or we could obsoleted it entirely
when time comes, an application that already passes MAP_SEALABLE can
be treated as noop. However, if all memory were allowed to be sealable
from the beginning, reversing that decision would be hard.

After those considerations, if MAP_SEALABLE is still not preferred by
you. Then I have the following options for you to choose:

1. MAP_NOT_SEALABLE in the mmap(). And I will use them for the
heap/aio/shm case.
This basically says Linux does not officially support sealing on
those,  until we support them, we discourage the sealing on those
mappings.

2. make MAP_NOT_SEALABLE only a kernel visible flag. So application
space won't be able to use it.

3. open for all, and list as much as details in the documentation.
 If we choose this route, I would like to have more discussion on the
heap/stack, at least the Linux developers will learn from those
discussions.

> So the part I think is sane is the mseal() system call itself, in that
> it allows *potential* future expansion of the semantics.
>
> But hopefully said future expansion isn't even needed, and all users
> want the base experience, which is why I think PROT_SEAL (both to mmap
> and to mprotect) makes sense as an alternative form.
>
> So yes, to my mind
>
>     mprotect(addr, len, PROT_READ);
>     mseal(addr, len, 0);
>
> should basically give identical results to
>
>     mprotect(addr, len, PROT_READ | PROT_SEAL);
>
> and using PROT_SEAL at mmap() time is similarly the same obvious
> notion of "map this, and then seal that mapping".
>
> The reason for having "mseal()" as a separate call at all from the
> PROT_SEAL bit is that it does allow possible future expansion (while
> PROT_SEAL is just a single bit, and it won't change semantics) but
> also so that you can do whatever prep-work in stages if you want to,
> and then just go "now we seal it all".
>

To clarify: do you mean to have the following ?

mmap(PROT_READ|PROT_SEAL)
mseal(addr,len,0)
mprotect(addr,len,PROT_READ|PROT_SEAL) ?

I have to think about the mprotect() case.

For mmap(PROT_READ|PROT_SEAL),  I might  have a use case already:

fs/binfmt_elf.c
if (current->personality & MMAP_PAGE_ZERO) {
                /* Why this, you ask???  Well SVr4 maps page 0 as read-only,
                   and some applications "depend" upon this behavior.
                   Since we do not have the power to recompile these, we
                   emulate the SVr4 behavior. Sigh. */

                error = vm_mmap(NULL, 0, PAGE_SIZE,
                                PROT_READ | PROT_EXEC,   <-- add PROT_SEAL
                                MAP_FIXED | MAP_PRIVATE, 0);
        }

I don't see the benefit of RWX page 0, which might make a null
pointers error to become executable for some code.

Best Regards,
-Jeff

>           Linus