On Mon, 2020-07-20 at 13:30 +0200, Arnd Bergmann wrote:
> On Mon, Jul 20, 2020 at 11:25 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
> > 
> > From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > 
> > Introduce "secretmemfd" system call with the ability to create
> > memory areas visible only in the context of the owning process and
> > not mapped not only to other processes but in the kernel page
> > tables as well.
> > 
> > The user will create a file descriptor using the secretmemfd system
> > call where flags supplied as a parameter to this system call will
> > define the desired protection mode for the memory associated with
> > that file descriptor. Currently there are two protection modes:
> > 
> > * exclusive - the memory area is unmapped from the kernel direct
> >               map and it is present only in the page tables of the
> >               owning mm.
> > * uncached  - the memory area is present only in the page tables of
> >               the owning mm and it is mapped there as uncached.
> > 
> > For instance, the following example will create an uncached mapping
> > (error handling is omitted):
> > 
> >         fd = secretmemfd(SECRETMEM_UNCACHED);
> >         ftruncate(fd, MAP_SIZE);
> >         ptr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
> >                    MAP_SHARED, fd, 0);
> > 
> > Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> 
> I wonder if this should be more closely related to dmabuf file
> descriptors, which are already used for a similar purpose: sharing
> access to secret memory areas that are not visible to the OS but can
> be shared with hardware through device drivers that can import a
> dmabuf file descriptor.

I'll assume you mean the dmabuf userspace API?  Because the kernel API
is completely device exchange specific and wholly inappropriate for
this use case.

The user space API of dmabuf uses a pseudo-filesystem.  So you mount
the dmabuf file type (and by "you" I mean root because an ordinary user
doesn't have sufficient privilege).
This is basically because every dmabuf is usable by any user who has
permissions.  This really isn't the initial interface we want for
secret memory because secret regions are supposed to be per process and
not shared (at least we don't want other tenants to see who's using
what).

Once you have the fd, you can seek to find the size, mmap, poll and
ioctl it.  The ioctls are all to do with memory synchronization (as
you'd expect from a device-backed region) and the mmap is handled by
the dma_buf_ops, which is device specific.  Sizing is missing because
that's reported by the device, not settable by the user.

What we want is the ability to get an fd, set the properties and the
size and mmap it.  This is pretty much a 100% overlap with the memfd
API and not much overlap with the dmabuf one, which is why I don't
think the interface is very well suited.

James
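
For illustration only, the "get an fd, set the size and mmap it" flow
described above can be sketched with the existing memfd_create() call.
This is not the proposed secretmemfd interface and gives none of its
protection; it only shows the memfd-style sequence the proposal mirrors.
The helper name map_memfd_region() and its parameters are hypothetical,
and the sketch assumes a glibc new enough to provide the memfd_create()
wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch set: create an anonymous
 * memfd, size it with ftruncate() and map it shared and read/write.
 *
 * Returns the mapping and stores the descriptor in *fd_out on success,
 * or returns NULL on failure (errno is left set by the failing call).
 */
static void *map_memfd_region(const char *name, size_t size, int *fd_out)
{
	void *ptr;
	int fd;

	assert(fd_out != NULL);

	/* Step 1: get an fd; flags are where properties would be set. */
	fd = memfd_create(name, MFD_CLOEXEC);
	if (fd < 0)
		return NULL;

	/* Step 2: the user, not a device, decides the size. */
	if (ftruncate(fd, (off_t)size) < 0) {
		close(fd);
		return NULL;
	}

	/* Step 3: map it into the owning process. */
	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (ptr == MAP_FAILED) {
		close(fd);
		return NULL;
	}

	*fd_out = fd;
	return ptr;
}
```

A caller would invoke map_memfd_region("demo", MAP_SIZE, &fd) and later
release the region with munmap() and close().  With a dmabuf fd the
second step has no equivalent, since the size comes from the exporting
device.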