On Mon 08-02-21 23:26:05, Mike Rapoport wrote:
> On Mon, Feb 08, 2021 at 11:49:22AM +0100, Michal Hocko wrote:
> > On Mon 08-02-21 10:49:17, Mike Rapoport wrote:
[...]
> > > The file descriptor based memory has several advantages over the
> > > "traditional" mm interfaces, such as mlock(), mprotect(), madvise(). It
> > > paves the way for VMMs to remove the secret memory range from the process;
> >
> > I do not understand how it helps to remove the memory from the process
> > as the interface explicitly allows to add a memory that is removed from
> > all other processes via direct map.
>
> The current implementation does not help to remove the memory from the
> process, but using fd-backed memory seems a better interface to remove
> guest memory from host mappings than mmap. As Andy nicely put it:
>
> "Getting fd-backed memory into a guest will take some possibly major work in
> the kernel, but getting vma-backed memory into a guest without mapping it
> in the host user address space seems much, much worse."

OK, so IIUC this means that the model is to hand over memory from host
to guest. I thought the guest would be under control of its address
space and therefore it operates on the VMAs. This would benefit from
an additional and more specific clarification.

> > > As secret memory implementation is not an extension of tmpfs or hugetlbfs,
> > > usage of a dedicated system call rather than hooking new functionality into
> > > memfd_create(2) emphasises that memfd_secret(2) has different semantics and
> > > allows better upwards compatibility.
> >
> > What is this supposed to mean? What are differences?
>
> Well, the phrasing could be better indeed. That supposed to mean that
> they differ in the semantics behind the file descriptor: memfd_create
> implements sealing for shmem and hugetlbfs while memfd_secret implements
> memory hidden from the kernel.

Right but why memfd_create model is not sufficient for the usecase?
Please note that I am arguing against. To be honest I do not really care
much. Using an existing scheme is usually preferable from my POV but
there might be real reasons why shmem as a backing "storage" is not
appropriate.

> > > The secretmem mappings are locked in memory so they cannot exceed
> > > RLIMIT_MEMLOCK. Since these mappings are already locked an attempt to
> > > mlock() secretmem range would fail and mlockall() will ignore secretmem
> > > mappings.
> >
> > What about munlock?
>
> Isn't this implied? ;-)

My bad here. I thought that munlock fails on vmas which are not mlocked
and I was curious about the behavior when mlockall() is followed by
munlock. But I do not see this being the case. So this should be ok.

-- 
Michal Hocko
SUSE Labs