Re: [PATCH 08/23] man2: new page describing memfd_secret() system call

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Michael,

On Mon, Aug 09, 2021 at 04:00:46AM +0200, Michael Kerrisk (man-pages) wrote:
> Hi Mike and Alex,
> 
> I think some more work is needed for this page. Mike, would
> you be willing to do some work on the points below please?

I'm really stretched at the moment, so it'll take a while.

What do you say about starting without the elaborate NOTES section and only
updating the page according to your comments below?

I will add the NOTES section a bit later then.
 
> On 8/8/21 10:41 AM, Alejandro Colomar wrote:
> > From: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > 
> > Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > Signed-off-by: Alejandro Colomar <alx.manpages@xxxxxxxxx>
> > ---
> >  man2/memfd_secret.2 | 146 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 146 insertions(+)
> >  create mode 100644 man2/memfd_secret.2
> > 
> > diff --git a/man2/memfd_secret.2 b/man2/memfd_secret.2
> > new file mode 100644
> > index 000000000..466aa4236
> > --- /dev/null
> > +++ b/man2/memfd_secret.2
> > @@ -0,0 +1,146 @@
> > +.\" Copyright (c) 2021, IBM Corporation.
> > +.\" Written by Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > +.\"
> > +.\" Based on memfd_create(2) man page
> > +.\" Copyright (C) 2014 Michael Kerrisk <mtk.manpages@xxxxxxxxx>
> > +.\" and Copyright (C) 2014 David Herrmann <dh.herrmann@xxxxxxxxx>
> > +.\"
> > +.\" %%%LICENSE_START(GPLv2+)
> > +.\"
> > +.\" This program is free software; you can redistribute it and/or modify
> > +.\" it under the terms of the GNU General Public License as published by
> > +.\" the Free Software Foundation; either version 2 of the License, or
> > +.\" (at your option) any later version.
> > +.\"
> > +.\" This program is distributed in the hope that it will be useful,
> > +.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> > +.\" GNU General Public License for more details.
> > +.\"
> > +.\" You should have received a copy of the GNU General Public
> > +.\" License along with this manual; if not, see
> > +.\" <http://www.gnu.org/licenses/>.
> > +.\" %%%LICENSE_END
> > +.\"
> > +.TH MEMFD_SECRET 2 2020-08-02 Linux "Linux Programmer's Manual"
> > +.SH NAME
> > +memfd_secret \- create an anonymous file to access secret memory regions
> > +.SH SYNOPSIS
> > +.nf
> > +.PP
> > +.BR "#include <sys/syscall.h>" "      /* Definition of " SYS_* " constants */"
> > +.B #include <unistd.h>
> > +.PP
> > +.BI "int syscall(SYS_memfd_secret, unsigned int " flags );
> > +.fi
> > +.PP
> > +.IR Note :
> > +glibc provides no wrapper for
> > +.BR memfd_secret (),
> > +necessitating the use of
> > +.BR syscall (2).
> > +.SH DESCRIPTION
> > +.BR memfd_secret ()
> > +creates an anonymous file and returns a file descriptor that refers to it.
> 
> s/file/RAM-based file/
> 
> > +The file provides a way to create and access memory regions
> > +with stronger protection than usual RAM-based files and
> > +anonymous memory mappings.
> > +Once all references to the file are dropped, it is automatically released.
> 
> "dropped" is not clear. Should it be something like:
> 
>     Once all open references to the file are closed,
> 
> > +The initial size of the file is set to 0.
> > +Following the call, the file size should be set using
> > +.BR ftruncate (2).
> > +.PP
> > +The memory areas backing the file created with
> > +.BR memfd_create(2)
> > +are visible only to the contexts that have access to the file descriptor.
> 
> "contexts" is not clear here. Can you reword to explain what you mean?
> (processes, threads, something else?)
> 
> > +These areas are removed from the kernel page tables
> 
> s/These areas are/The memory region is/
> 
> > +and only the page tables of the processes holding the file descriptor
> > +map the corresponding physical memory.
> 
> Perhaps a sentence here such as:
> 
> "(Thus, the pages in the region can't be accessed by the kernel itself,
> so that, for example, pointers to the region can't be passed to 
> system calls.)"
> 
> > +.PP
> > +The following values may be bitwise ORed in
> > +.I flags
> > +to control the behavior of
> > +.BR memfd_secret (2):
> > +.TP
> > +.B FD_CLOEXEC
> > +Set the close-on-exec flag on the new file descriptor.
> 
> s/.$/, which causes the region to be removed from the process on execve(2)./
> 
> > +See the description of the
> > +.B O_CLOEXEC
> > +flag in
> > +.BR open (2)
> > +for reasons why this may be useful.
> 
> Maybe the previous sentence is not necessary?
> 
> > +.PP
> > +As its return value,
> > +.BR memfd_secret ()
> > +returns a new file descriptor that can be used to refer to an anonymous file.
> 
> s/that can be used to refer/that refers/
> 
> > +This file descriptor is opened for both reading and writing
> > +.RB ( O_RDWR )
> > +and
> > +.B O_LARGEFILE
> > +is set for the file descriptor.
> > +.PP
> > +With respect to
> > +.BR fork (2)
> > +and
> > +.BR execve (2),
> > +the usual semantics apply for the file descriptor created by
> > +.BR memfd_secret ().
> > +A copy of the file descriptor is inherited by the child produced by
> > +.BR fork (2)
> > +and refers to the same file.
> > +The file descriptor is preserved across
> > +.BR execve (2),
> > +unless the close-on-exec flag has been set.
> > +.PP
> > +The memory regions backed with
> > +.BR memfd_secret ()
> > +are locked in the same way as
> > +.BR mlock (2),
> 
> I find the wording here just a little unclear
> 
> How about:
> 
>     The memory region is locked into memory in the same way as
>     with mlock(2), so that it will never be written into swap
> 
> > +however the implementation will not try to> +populate the whole range during the
> > +.BR mmap (2)
> > +call.
> 
> s/call./call that attaches the region into the process's address space;
>         instead, the pages are only actually allocated as they are
>         faulted in./
> 
> > +The amount of memory allowed for memory mappings
> > +of the file descriptor obeys the same rules as
> > +.BR mlock (2)
> > +and cannot exceed
> > +.BR RLIMIT_MEMLOCK .
> > +.SH RETURN VALUE
> > +On success,
> > +.BR memfd_secret ()
> > +returns a new file descriptor.
> > +On error, \-1 is returned and
> > +.I errno
> > +is set to indicate the error.
> > +.SH ERRORS
> > +.TP
> > +.B EINVAL
> > +.I flags
> > +included unknown bits.
> > +.TP
> > +.B EMFILE
> > +The per-process limit on the number of open file descriptors has been reached.
> > +.TP
> > +.B EMFILE
> > +The system-wide limit on the total number of open files has been reached.
> > +.TP
> > +.B ENOMEM
> > +There was insufficient memory to create a new anonymous file.
> > +.TP
> > +.B ENOSYS
> > +.BR memfd_secret ()
> > +is not implemented on this architecture.
> > +.SH VERSIONS
> > +The
> > +.BR memfd_secret (2)
> > +system call first appeared in Linux 5.14.
> > +.SH CONFORMING TO
> > +The
> > +.BR memfd_secret (2)
> > +system call is Linux-specific.
> > +.SH SEE ALSO
> > +.BR fcntl (2),
> > +.BR ftruncate (2),
> > +.BR mlock (2),
> > +.BR mmap (2),
> > +.BR setrlimit (2)
> 
> I feel like this page could benefit from a NOTES section
> that explains the rationale for the system call. This could
> note that the fact that the region is not accessible from the
> kernel removes a whole class of security attacks.
> 
> Also, the NOTES section could mention the  "secretmem_enable"
> boot option, what its purpose is, what values it can have,
> and what is default behavior if this option is not specified.
> 
> Also, is ti still the case that if this system call is used,
> then users can no longer hibernate their systems? If so,
> this really should be mentioned in NOTES!
> 
> Also, in NOTES perhaps it is worth mentioning that the
> pages in the region can enter the cache (right?).
> 
> Perhaps Jon's articles at https://lwn.net/Articles/865256/
> https://lwn.net/Articits/835342/ and https://lwn.net/Articles/812325/,
> as well as your own commit message
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1507f51255c9)
> may inspire some other ideas on details that should be included
> in NOTES.
> 
> Thanks,
> 
> Michael
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

-- 
Sincerely yours,
Mike.



[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux