Re: [RFC PATCH v2 0/5] mm: extend memfd with ability to create "secret" memory areas

James Bottomley <jejb@xxxxxxxxxxxxx> · Fri, 17 Jul 2020 07:43:51 -0700

On Fri, 2020-07-17 at 10:36 +0200, Pavel Machek wrote:
> Hi!
> 
> > This is a second version of "secret" mappings implementation backed
> > by a file descriptor. 
> > 
> > The file descriptor is created using memfd_create() syscall with a
> > new MFD_SECRET flag. The file descriptor should be configured using
> > ioctl() to define the desired protection and then mmap() of the fd
> > will create a "secret" memory mapping. The pages in that mapping
> > will be marked as not present in the direct map and will have
> > desired protection bits set in the user page table. For instance,
> > current implementation allows uncached mappings.
> > 
> > Hiding secret memory mappings behind an anonymous file allows
> > (ab)use of the page cache for tracking pages allocated for the
> > "secret" mappings as well as using address_space_operations for
> > e.g. page migration callbacks.
> > 
> > The anonymous file may be also used implicitly, like hugetlb files,
> > to implement mmap(MAP_SECRET) and use the secret memory areas with
> > "native" mm ABIs.
> 
> I believe unix userspace normally requires mappings to be... well...
> protected from other users. How is this "secret" thing different? How
> do you explain the difference to userland programmers?

That's true in the normal case, but for the container cloud the threat
model we're considering is a hostile other tenant trying to trick the
kernel into giving them access to your mappings.  In the FOSDEM talk we
did about this:

https://fosdem.org/2020/schedule/event/kernel_address_space_isolation/

We demonstrated the case where the hostile tenant obtained host root
and then tried to get access via ptrace.  The point being that pushing
the pages out of the direct map means that even root can't get access
to the secret by any means the OS provides.  If you want to play with
this yourself, we have a userspace library:

https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloader.git/

It does two things.  First, it acts as a preloader for openssl,
redirecting all the OPENSSL_malloc calls to secret memory so that any
secret keys get protected automatically.  Second, it exposes the API to
users who need it.  I anticipate that a lot of the use cases will be
like the openssl one: many toolkits that deal with secret keys already
have special handling for that memory to try to give it greater
protection, so this would simply plug into those toolkits without any
need to modify user applications.
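
To give a feel for what such a plug-in allocator might look like, here
is a rough sketch of the memfd_create() -> ioctl() -> mmap() sequence
from the cover letter quoted above.  It is not code from the preloader
library.  The flag value HYP_MFD_SECRET, the ioctl number
HYP_SECRET_IOC_UNCACHED and the helper names secret_alloc() and
secret_free() are all made up for illustration; the real values come
from the patched kernel's uapi headers.  On a kernel without the patches
the sketch falls back to an ordinary anonymous mapping and says so
through is_secret.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * ASSUMPTION: placeholder values for illustration only.  The real
 * MFD_SECRET flag and the protection ioctl are defined by the RFC
 * patches, not by this sketch.
 */
#define HYP_MFD_SECRET           0x0100U
#define HYP_SECRET_IOC_UNCACHED  _IO('s', 0x01)

/*
 * Try to map len bytes of "secret" memory.  *is_secret is set to 1 if
 * the secret path worked and to 0 if we fell back to a plain anonymous
 * mapping.  Returns NULL if no mapping could be created at all.
 */
void *secret_alloc(size_t len, int *is_secret)
{
	void *p;
	int fd = (int)syscall(SYS_memfd_create, "secret", HYP_MFD_SECRET);

	*is_secret = 0;
	if (fd >= 0) {
		/* Configure the protection mode first, then mmap() the fd. */
		if (ioctl(fd, HYP_SECRET_IOC_UNCACHED) == 0) {
			p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_SHARED, fd, 0);
			if (p != MAP_FAILED) {
				close(fd);	/* the mapping keeps the file alive */
				*is_secret = 1;
				return p;
			}
		}
		close(fd);
	}

	/* Unpatched kernel: ordinary memory, NOT removed from the direct map. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}

void secret_free(void *p, size_t len)
{
	if (p)
		munmap(p, len);
}
```

A toolkit would call secret_alloc() wherever it currently allocates key
material and check is_secret to decide whether to warn or to refuse to
run without the protection.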

James