Re: [PATCH 26/26] x86, pkeys: Documentation

Ingo Molnar <mingo@xxxxxxxxxx> · Fri, 2 Oct 2015 08:23:40 +0200

* Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> >> Assuming it boots up fine on a typical distro, i.e. assuming that there are no
> >> surprises where PROT_READ && PROT_EXEC sections are accessed as data.
> >
> > I can't wait to find out what implicitly expects PROT_READ from
> > PROT_EXEC mappings. :)

So what seems to happen is that there are no pure PROT_EXEC mappings in
practice: they are all omnibus PROT_READ|PROT_EXEC mappings, an unknown
proportion of which truly rely on PROT_READ:

  $ for C in firefox ls perf libreoffice google-chrome Xorg xterm \
      konsole; do echo; echo "# $C:"; strace -e trace=mmap -f $C -h 2>&1 | cut -d, -f3 | \
      grep PROT | sort | uniq -c; done

# firefox:
     13  PROT_READ
     82  PROT_READ|PROT_EXEC
    184  PROT_READ|PROT_WRITE
      2  PROT_READ|PROT_WRITE|PROT_EXEC

# ls:
      2  PROT_READ
      7  PROT_READ|PROT_EXEC
     17  PROT_READ|PROT_WRITE

# perf:
      1  PROT_READ
     20  PROT_READ|PROT_EXEC
     44  PROT_READ|PROT_WRITE

# libreoffice:
      2  PROT_NONE
     87  PROT_READ
    148  PROT_READ|PROT_EXEC
    339  PROT_READ|PROT_WRITE

# google-chrome:
     39  PROT_READ
    121  PROT_READ|PROT_EXEC
    345  PROT_READ|PROT_WRITE

# Xorg:
      1  PROT_READ
     22  PROT_READ|PROT_EXEC
     39  PROT_READ|PROT_WRITE

# xterm:
      1  PROT_READ
     25  PROT_READ|PROT_EXEC
     46  PROT_READ|PROT_WRITE

# konsole:
      1  PROT_READ
    101  PROT_READ|PROT_EXEC
    175  PROT_READ|PROT_WRITE
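
The same tally can be reproduced offline from a saved strace log, which
also makes it easy to check for any PROT_EXEC mapping that lacks
PROT_READ. A minimal sketch, assuming strace's usual
"mmap(addr, len, prot, flags, fd, off) = ret" line format; the function
names and the sample lines are made up for illustration:

```python
import re
from collections import Counter

# Captures the third mmap() argument (the protection flags) as strace
# prints it, e.g. "mmap(NULL, 4096, PROT_READ|PROT_EXEC, ...".
MMAP_PROT = re.compile(r"\bmmap2?\([^,]*,[^,]*,\s*(PROT_[A-Z_|]+)")


def tally_prot(lines):
    """Count how often each protection-flag combination is requested."""
    counts = Counter()
    for line in lines:
        match = MMAP_PROT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def exec_without_read(counts):
    """Return flag combinations that contain PROT_EXEC but not PROT_READ."""
    result = []
    for prot in counts:
        flags = set(prot.split("|"))
        if "PROT_EXEC" in flags and "PROT_READ" not in flags:
            result.append(prot)
    return result


# Hypothetical sample input, shaped like strace -f output.
sample = [
    "mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0000000000",
    "mmap(NULL, 2101248, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f0000200000",
    "[pid  1234] mmap(NULL, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x7f0000400000",
    "mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0000600000",
]
counts = tally_prot(sample)
pure_exec = exec_without_read(counts)
```

An empty list from exec_without_read() corresponds to the observation
above: none of the traced mmap() calls asked for execute-only memory.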

So whatever kernel side method we come up with, it's not something that I expect 
to become production quality. "Proper" conversion to pkeys has to be driven from 
the user-space side.

That does not mean we cannot try! :-)

> There's one annoying issue at least:
> 
> mprotect_pkey(..., PROT_READ | PROT_EXEC, 0) sets protection key 0.
> mprotect_pkey(..., PROT_EXEC, 0) maybe sets protection key 15 or
> whatever we use for this.  What does mprotect_pkey(..., PROT_EXEC, 0)
> do?  What if the caller actually wants key 0?  What if some CPU vendor
> some day implements --x for real?

That comes from the hardcoded "user-space has 4 bits to itself, not managed by the 
kernel" assumption in the whole design. So no layering between different 
user-space libraries using pkeys in a different fashion, no transparent kernel use 
of pkeys (such as it may be), etc.

I'm not sure it's _worth_ managing these 4 bits, but '16 separate keys' does
seem to me to be above a certain resource threshold, one that should be managed
more explicitly than by telling user-space: "it's all yours!".
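
To make the numbers concrete: 4 key bits per page-table entry give 16
keys, and the 32-bit PKRU register holds two bits per key, access-disable
(AD) at bit 2*key and write-disable (WD) at bit 2*key+1. Instruction
fetches are not checked against PKRU, which is what makes a key with AD
set usable as an execute-only approximation. A toy model of the user-mode
data-access check follows; the helper names are invented for this sketch,
and key 15 is only the example number from the quoted text above:

```python
NR_PKEYS = 16   # 4 key bits per page-table entry -> 16 keys
PKRU_AD = 0x1   # access-disable: blocks data reads and writes
PKRU_WD = 0x2   # write-disable: blocks data writes only


def pkru_set(pkru, key, bits):
    """Return a new PKRU value with the two bits for 'key' replaced."""
    if not 0 <= key < NR_PKEYS:
        raise ValueError("key out of range")
    if bits & ~(PKRU_AD | PKRU_WD):
        raise ValueError("only AD and WD bits are defined")
    shift = 2 * key
    return ((pkru & ~(0x3 << shift)) | (bits << shift)) & 0xFFFFFFFF


def data_access_allowed(pkru, key, write):
    """Model the PKRU check for a user-mode data access to a page."""
    bits = (pkru >> (2 * key)) & 0x3
    if bits & PKRU_AD:
        return False
    if write and (bits & PKRU_WD):
        return False
    return True


# Deny all data access through key 15; code tagged with that key can
# still be executed because instruction fetches ignore PKRU.
exec_only_pkru = pkru_set(0, 15, PKRU_AD)
```

In this model, pages left on key 0 behave exactly as before, which is why
key 0 doubling as the default key makes the mprotect_pkey(..., 0) case
ambiguous.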

> Also, how do we do mprotect_pkey and say "don't change the key"?

So if we start managing keys as a resource (i.e. alloc/free up to 16 of them), and 
provide APIs for user-space to do all that, then user-space is not supposed to 
touch keys it has not allocated for itself - just like it's not supposed to write 
to fds it has not opened.

Such an allocation method can still 'mess up', and if the kernel allocates a key 
for its purposes it should not assume that user-space cannot change it, but at 
least for non-buggy code there's no interaction and it would work out fine.
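
As a rough illustration of what "managing keys as a resource" could look
like, here is a toy bitmap allocator. It is purely a sketch: the class
and method names are hypothetical and no such interface exists in this
patch set. It treats reserved keys (for example the default key 0, or one
the kernel keeps for itself) as never available to callers:

```python
import errno


class PkeyPool:
    """Toy model of per-process protection-key allocation."""

    def __init__(self, nr_keys=16, reserved=()):
        self.nr_keys = nr_keys
        self.reserved = 0    # bitmap of keys callers may never get
        self.allocated = 0   # bitmap of keys handed out by alloc()
        for key in reserved:
            self._check_range(key)
            self.reserved |= 1 << key

    def _check_range(self, key):
        if not 0 <= key < self.nr_keys:
            raise ValueError("key out of range")

    def alloc(self):
        """Hand out the lowest free key, or fail when all are in use."""
        in_use = self.reserved | self.allocated
        for key in range(self.nr_keys):
            if not in_use & (1 << key):
                self.allocated |= 1 << key
                return key
        raise OSError(errno.ENOSPC, "no free protection keys")

    def owns(self, key):
        """True if 'key' was obtained through alloc() and not yet freed."""
        self._check_range(key)
        return bool(self.allocated & (1 << key))

    def free(self, key):
        """Release a key; freeing a key that was never allocated is a bug."""
        if not self.owns(key):
            raise ValueError("key was not allocated by this pool")
        self.allocated &= ~(1 << key)


pool = PkeyPool(reserved=(0,))
first = pool.alloc()
```

The owns() check is the analogue of the fd comparison above: well-behaved
code only operates on keys it got from alloc(), while nothing here can
stop buggy code from touching the others.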

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .