Hi Alex,

On Fri, Sep 10, 2021 at 03:12:37PM +0200, Alejandro Colomar (man-pages) wrote:
> Hi Mike,
>
> On 9/2/21 9:50 AM, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> >
> > ... that explains the rationale for the system call
> >
> > Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
>
> I found a few formatting/wording issues (see below; but I fixed them myself,
> so you don't need to worry about them).

Thanks a lot!

> In general, I understood the rationale for the system call,
> so I applied the patch to my tree.  However, there are some parts that I
> didn't understand well, mostly related to kernel internals, but since
> Michael knows more about those, I expect him to review those again when I
> send him the patch.
> Thanks!
>
> Alex
>
> > ---
> >  man2/memfd_secret.2 | 61 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 61 insertions(+)
> >
> > diff --git a/man2/memfd_secret.2 b/man2/memfd_secret.2
> > index f3380818e..869480b48 100644
> > --- a/man2/memfd_secret.2
> > +++ b/man2/memfd_secret.2
> > @@ -147,6 +147,67 @@ system call first appeared in Linux 5.14.
> >  The
> >  .BR memfd_secret ()
> >  system call is Linux-specific.
> > +.SH NOTES
> > +.PP
>
> Unnecessary .PP after .SH or .SS
>
> > +The
> > +.BR memfd_secret ()
> > +system call is designed to allow a user-space process
> > +to create a range of memory that is inaccessible to anybody else -
> > +kernel included.
> > +There is no 100% guarantee that kernel won't be able to access
> > +memory ranges backed by
> > +.BR memfd_secret ()
> > +in any circumstances, but nevertheless,
> > +it is much harder to exfiltrate data from these regions.
> > +.PP
> > +The
>
> /The/d
>
> > +.BR memfd_secret ()
> > +provides the following protections:
> > +.IP \(bu 3
> > +Enhanced protection
> > +(in conjunction with all the other in-kernel attack prevention systems)
> > +against ROP attacks.
> > +Absence of any in-kernel primitive for accessing memory backed by
> > +.BR memfd_secret ()
> > +means that one-gadget ROP attack
> > +can't work to perform data exfiltration.
> > +The attacker would need to find enough ROP gadgets
> > +to reconstruct the missing page table entries,
> > +which significantly increases difficulty of the attack,
> > +especially when other protections like the kernel stack size limit
> > +and address space layout randomization are in place.
> > +.IP \(bu
> > +Prevent cross-process userspace memory exposures.
>
> s/userspace/user-space/
>
> > +Once a region for a
> > +.BR memfd_secret ()
> > +memory mapping is allocated,
> > +the user can't accidentally pass it into the kernel
> > +to be transmitted somewhere.
> > +The memory pages in this region cannot be accessed via the direct map
> > +and they are disallowed in get_user_pages.
> > +.IP \(bu
> > +Harden against exploited kernel flaws.
> > +In order to access memory areas backed by
> > +.BR memfd_secret (),
> > +a kernel-side attack would need to
> > +either walk the page tables and create new ones,
> > +or spawn a new privileged userspace process to perform
>
> s/userspace/user-space/
>
> > +secrets exfiltration using
> > +.BR ptrace (2).
> > +.PP
> > +The way
> > +.BR memfd_secret ()
> > +allocates and locks the memory may impact overall system performance,
> > +therefore the system call is disabled by default and only available
> > +if the system administrator turned it on using
> > +"secretmem.enable=y" kernel parameter.
> > +.PP
> > +To prevent potential data leaks of memory regions backed by
> > +.BR memfd_secret ()
> > +from a hibernation image,
> > +hibernation is prevented when there are active
> > +.BR memfd_secret ()
> > +users.
> >  .SH SEE ALSO
> >  .BR fcntl (2),
> >  .BR ftruncate (2),
> >
>
>
> --
> Alejandro Colomar
> Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
> http://www.alejandro-colomar.es/

--
Sincerely yours,
Mike.