Hi Alex, On Sat, 19 Dec 2020 15:00:00 +0100, "Alejandro Colomar (man-pages)" <alx.manpages@xxxxxxxxx> wrote: > Please see some comments below. > It's looking good ;) Thanks for your review and patience! > On 12/18/20 5:58 PM, Stephen Kitt wrote: > > This documents close_range(2) based on information in > > 278a5fbaed89dacd04e9d052f4594ffd0e0585de, > > 60997c3d45d9a67daf01c56d805ae4fec37e0bd8, and > > 582f1fb6b721facf04848d2ca57f34468da1813e. > > > > Signed-off-by: Stephen Kitt <steve@xxxxxxx> > > --- > > V3: fix synopsis overflow > > copy notes from membarrier.2 re the lack of wrapper > > semantic newlines > > drop non-standard "USE CASES" section heading > > add code example > > > > V2: unsigned int to match the kernel declarations > > groff and grammar tweaks > > CLOSE_RANGE_UNSHARE unshares *and* closes > > Explain that EMFILE and ENOMEM can occur with C_R_U > > "Conforming to" phrasing > > Detailed explanation of CLOSE_RANGE_UNSHARE > > Reading /proc isn't common > > > > man2/close_range.2 | 266 +++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 266 insertions(+) > > create mode 100644 man2/close_range.2 > > > > diff --git a/man2/close_range.2 b/man2/close_range.2 > > new file mode 100644 > > index 000000000..f8f2053ac > > --- /dev/null > > +++ b/man2/close_range.2 > > @@ -0,0 +1,266 @@ > > +.\" Copyright (c) 2020 Stephen Kitt <steve@xxxxxxx> > > +.\" > > +.\" %%%LICENSE_START(VERBATIM) > > +.\" Permission is granted to make and distribute verbatim copies of this > > +.\" manual provided the copyright notice and this permission notice are > > +.\" preserved on all copies. > > +.\" > > +.\" Permission is granted to copy and distribute modified versions of > > this +.\" manual under the conditions for verbatim copying, provided that > > the +.\" entire resulting derived work is distributed under the terms of a > > +.\" permission notice identical to this one. > > +.\" > > +.\" Since the Linux kernel and libraries are constantly changing, this > > +.\" manual page may be incorrect or out-of-date. The author(s) assume no > > +.\" responsibility for errors or omissions, or for damages resulting from > > +.\" the use of the information contained herein. The author(s) may not > > +.\" have taken the same level of care in the production of this manual, > > +.\" which is licensed free of charge, as they might when working > > +.\" professionally. > > +.\" > > +.\" Formatted or processed versions of this manual, if unaccompanied by > > +.\" the source, must acknowledge the copyright and authors of this work. > > +.\" %%%LICENSE_END > > +.\" > > +.TH CLOSE_RANGE 2 2020-12-08 "Linux" "Linux Programmer's Manual" > > +.SH NAME > > +close_range \- close all file descriptors in a given range > > +.SH SYNOPSIS > > +.nf > > +.B #include <linux/close_range.h> > > +.PP > > +.BI "int close_range(unsigned int " first ", unsigned int " last , > > +.BI " unsigned int " flags ); > > +.fi > > +.PP > > +.IR Note : > > +There is no glibc wrapper for this system call; see NOTES. > > +.SH DESCRIPTION > > +The > > +.BR close_range () > > +system call closes all open file descriptors from > > +.I first > > +to > > +.I last > > +(included). > > +.PP > > +Errors closing a given file descriptor are currently ignored. > > +.PP > > +.I flags > > +can be 0 or set to one or both of the following: > > +.TP > > +.B CLOSE_RANGE_UNSHARE > > +unshares the range of file descriptors from any other processes, > > +before closing them, > > +avoiding races with other threads sharing the file descriptor table. > > +.TP > > +.BR CLOSE_RANGE_CLOEXEC " (since Linux 5.10)" > > |sort > > I prefer alphabetic order rather than adding new items at the bottom. > When lists grow, it becomes difficult to find what you're looking for. > > CLOEXEC should go before UNSHARE. That makes sense. > > +sets the close-on-exec bit instead of immediately closing the file > > +descriptors. > > [ > sets the close-on-exec bit instead of > immediately closing the file descriptors. > ] Is this for semantic reasons, or to balance the lines and make them easier to read in the roff source? > > +.SH RETURN VALUE > > +On success, > > +.BR close_range () > > +returns 0. > > +On error, \-1 is returned and > > +.I errno > > +is set to indicate the cause of the error. > > +.SH ERRORS > > +.TP > > +.B EINVAL > > +.I flags > > +is not valid, or > > +.I first > > +is greater than > > +.IR last . > > +.PP > > +The following can occur with > > +.B CLOSE_RANGE_UNSHARE > > +(when constructing the new descriptor table): > > +.TP > > +.B EMFILE > > +The per-process limit on the number of open file descriptors has been > > reached +(see the description of > > +.B RLIMIT_NOFILE > > +in > > +.BR getrlimit (2)). > > +.TP > > +.B ENOMEM > > +Insufficient kernel memory was available. > > +.SH VERSIONS > > +.BR close_range () > > +first appeared in Linux 5.9. > > +.SH CONFORMING TO > > +.BR close_range () > > +is a nonstandard function that is also present on FreeBSD. > > +.SH NOTES > > +Glibc does not provide a wrapper for this system call; call it using > > +.BR syscall (2). > > +.\" 278a5fbaed89dacd04e9d052f4594ffd0e0585de > > +.SS Closing all open file descriptors > > The comment with the commit would be better inside the section it refers > to, so: > > [ > .SS Closing all open file descriptors > .\" 278a5fbaed89dacd04e9d052f4594ffd0e0585de > ] Indeed! > > +To avoid blindly closing file descriptors in the range of possible > > +file descriptors, > > [ > To avoid blindly closing file descriptors > in the range of possible file descriptors, > ] > > > +this is sometimes implemented (on Linux) by listing open file > > +descriptors in > > [ > this is sometimes implemented (on Linux) > by listing open file descriptors in > ] > > > +.I /proc/self/fd/ > > +and calling > > +.BR close (2) > > +on each one. > > +.BR close_range () > > +can take care of this without requiring > > +.I /proc > > +and with a single system call, > > s/with/within/ > > > +which provides significant performance benefits. > > +.\" 60997c3d45d9a67daf01c56d805ae4fec37e0bd8 > > +.SS Closing file descriptors before exec > > [ > .SS Closing file descriptors before exec > .\" 60997c3d45d9a67daf01c56d805ae4fec37e0bd8 > ] > > > +File descriptors can be closed safely using > > +.PP > > +.in +4n > > +.EX > > +/* we don't want anything past stderr here */ > > +close_range(3, ~0U, CLOSE_RANGE_UNSHARE); > > +execve(....);> +.EE > > +.in > > +.PP > > +.B CLOSE_RANGE_UNSHARE > > +is conceptually equivalent to > > +.PP > > +.in +4n > > +.EX > > +unshare(CLONE_FILES); > > +close_range(first, last, 0); > > +.EE > > +.in > > +.PP > > +but can be more efficient: > > +if the unshared range extends past the current maximum number of file > > +descriptors allocated in the caller's file descriptor table > > [ > if the unshared range extends past > the current maximum number of file descriptors allocated > in the caller's file descriptor table > ] > > > +(the common case when > > +.I last > > +is > > +.BR ~0U ), > > Literal values are not (usually) formatted. > > [ > .I last > is ~0U), > ] > > > +the kernel will unshare a new file descriptor table for the caller up > > +to > > [ > the kernel will unshare a new file descriptor table for the caller up to > ] > > > +.IR first . > > +This avoids subsequent close calls entirely; > > +the whole operation is complete once the table is unshared. > > +.\" 582f1fb6b721facf04848d2ca57f34468da1813e > > +.SS Closing files on \fBexec\fP > > [ > .SS Closing files on \fBexec\fP > .\" 582f1fb6b721facf04848d2ca57f34468da1813e > ] > > > +This is particularly useful in cases where multiple > > +.RB pre- exec > > +setup steps risk conflicting with each other. > > +For example, setting up a > > +.BR seccomp (2) > > +profile can conflict with a > > +.B close_range > > .BR close_range () > > > +call: > > +if the file descriptors are closed before the seccomp profile is set > > .BR seccomp (2) > > > +up, > > Please, split at a different point. > > > +the profile setup can't use them control their closure; > > I don't understand what you wanted to say. them? Oops, I meant "the profile setup can't use them itself, or control their closure". > > > +if the file descriptors are closed afterwards, > > +the seccomp profile can't block the > > +.B close_range > > .BR close_range () > > > +call or any fallbacks. > > +Using > > +.B CLOSE_RANGE_CLOEXEC > > +avoids this: > > +the descriptors can be marked before the seccomp profile is set up, > > .BR seccomp (2) > > > +and the profile can control access to > > +.B close_range > > .BR close_range () > > > +without affecting the calling process. > > +.SH EXAMPLES > > +The following program is designed to be execed by the second program > > +below. > > +It lists its open file descriptors: > > +.PP > > +.in +4n > > +.EX > > +/* listopen.c */ > > + > > +#include <stdio.h> > > +#include <sys/stat.h> > > + > > +int > > +main(int argc, char *argv[]) > > +{ > > + int i; > > We use C99 declarations for loop indices. > > > + struct stat buf; > > + > > + for (i = 0; i < 100; i++) { > > for (int i = 0; i < 100; i++) { > > > + if (!fstat(i, &buf)) > > + printf("FD %d is open.\n", i); > > s/\\/\\e/ > > see: d1a719857b7eb68f5e5c1c965089038dee683240 > > I sometimes forget to fix those after copying the program to the page. > My solution is to copy the rendered text from the man page to a file > and then compile, and those errors become obvious ;) Ah yes, good catch. I was looking into automating checks for the source code included in man pages throughout the project, but that throws a spanner in the works! > > > + } > > + > > + exit(EXIT_SUCCESS); > > +) > > +.EE > > +.in > > +.PP > > +This program executes the command given on its command-line after > > +opening the files listed after the command, > > +and then using > > s/using/uses/ > > > +.B close_range > > .BR close_range () > > > +to close them: > > +.PP > > +.in +4n > > +.EX > > +/* close_range.c */ > > + > > +#include <fcntl.h> > > +#include <linux/close_range.h> > > +#include <stdio.h> > > +#include <stdlib.h> > > +#include <sys/stat.h> > > +#include <sys/syscall.h> > > +#include <sys/types.h> > > +#include <unistd.h> > > + > > +int > > +main(int argc, char *argv[]) > > +{ > > + char *newargv[] = { NULL }; > > + char *newenviron[] = { NULL }; > > + int i; > > dd > > > + > > + if (argc < 3) { > > + fprintf(stderr, "Usage: %s <command-to-run> <files-to-open>\n", > > argv[0]); > > s/\\/\\e/ > > > + exit(EXIT_FAILURE); > > + } > > + > > + for (i = 2; i < argc; i++) { > > for (int i = 2; i < argc; i++) { > > > + if (open(argv[i], O_RDONLY) == -1) { > > + perror(argv[i]); > > + exit(EXIT_FAILURE); > > + } > > + } > > + > > + if (syscall(__NR_close_range, 3, ~0U, CLOSE_RANGE_UNSHARE) == -1) { > > + perror("close_range"); > > + exit(EXIT_FAILURE); > > + } > > + > > + execve(argv[1], newargv, newenviron); > > + perror("execve"); > > + exit(EXIT_FAILURE); > > +} > > +.EE > > +.in > > +.PP > > +We can use the second program to exec the first as follows: > > +.PP > > +.in +4n > > +.EX > > +.RB "$" " make listopen close_range" > > +.RB "$" " ./close_range ./listopen /dev/null /dev/zero" > > +FD 0 is open. > > +FD 1 is open. > > +FD 2 is open. > > +.EE > > +.in > > +.PP > > +Removing the call to > > +.B close_range > > .BR close_range () > > > +will show different output, with the file descriptors for the named > > +files still open. > > [ > will show different output, > with the file descriptors for the named files still open. > ] Thanks, I'll send a v4 with all the fixes above. Regards, Stephen
Attachment:
pgpqu84KjkpjO.pgp
Description: OpenPGP digital signature