Hi Stephen,

Linux 5.10 was recently released.
Do you have any updates for this patch?

Thanks,

Alex

On 12/12/20 6:58 PM, Alejandro Colomar (man-pages) wrote:
> Hi Christian,
>
> Makes sense to me.
>
> Thanks,
>
> Alex
>
> On 12/12/20 1:14 PM, Christian Brauner wrote:
>> On Thu, Dec 10, 2020 at 03:36:42PM +0100, Alejandro Colomar (man-pages) wrote:
>>> Hi Christian,
>>
>> Hi Alex,
>>
>>>
>>> Thanks for confirming that behavior. Seems reasonable.
>>>
>>> I was wondering...
>>> If this call is equivalent to unshare(2)+{close(2) in a loop},
>>> shouldn't it fail for the same reasons those syscalls can fail?
>>>
>>> What about the following errors?:
>>>
>>> From unshare(2):
>>>
>>>        EPERM  The calling process did not have the required
>>>               privileges for this operation.
>>
>> unshare(CLONE_FILES) doesn't require any privileges. Only flags relevant
>> to kernel/nsproxy.c:unshare_nsproxy_namespaces() require privileges,
>> i.e.
>> CLONE_NEWNS
>> CLONE_NEWUTS
>> CLONE_NEWIPC
>> CLONE_NEWNET
>> CLONE_NEWPID
>> CLONE_NEWCGROUP
>> CLONE_NEWTIME
>> so the permissions are the same.
>>
>>>
>>> From close(2):
>>>        EBADF  fd isn't a valid open file descriptor.
>>>
>>> OK, this one can't happen with the current code.
>>> Let's say there are fds 1 to 10, and you call 'close_range(20,30,0)'.
>>> It's a no-op (although it will still unshare if the flag is set).
>>> But shouldn't it fail with EBADF?
>>
>> CLOSE_RANGE_UNSHARE should always give you a private file descriptor
>> table independent of whether or not any file descriptors need to be
>> closed. That's also how we documented the flag:
>>
>> /* Unshare the file descriptor table before closing file descriptors. */
>> #define CLOSE_RANGE_UNSHARE	(1U << 1)
>>
>> A caller calling unshare(CLONE_FILES) and then an emulated close_range()
>> or the proper close_range() syscall wants to make sure that all unwanted
>> file descriptors are closed (if any) and that no new file descriptors
>> can be injected afterwards.
>> If you skip the unshare(CLONE_FILES) because
>> there are no fds to be closed you open up a race window. It would also
>> be annoying for userspace if they _may_ have received a private file
>> descriptor table but only if any fds needed to be closed.
>>
>> If people really were extremely keen about skipping the unshare when no
>> fd needs to be closed then this could become a new flag. But I really
>> don't think that's necessary and also doesn't make a lot of sense, imho.
>>
>>>
>>>        EINTR  The close() call was interrupted by a signal; see
>>>               signal(7).
>>>
>>>        EIO    An I/O error occurred.
>>>
>>>        ENOSPC, EDQUOT
>>>               On NFS, these errors are not normally reported against
>>>               the first write which exceeds the available storage
>>>               space, but instead against a subsequent write(2),
>>>               fsync(2), or close().
>>
>> None of these will be seen by userspace because close_range() currently
>> ignores all errors after it has begun closing files.
>>
>> Christian
>>

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/
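
For reference, the "emulated close_range()" discussed in the thread could
look roughly like the sketch below. It is only an illustration of the
semantics Christian describes, not the kernel implementation:
emulated_close_range() and its do_unshare parameter are hypothetical names,
and capping the range at sysconf(_SC_OPEN_MAX) is a simplifying assumption
(descriptors above the current soft limit would be missed). The two points
it tries to show are that the unshare happens unconditionally when
requested, even if nothing in the range is open, and that errors from the
individual close(2) calls (EBADF, EINTR, EIO, ...) are not reported.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <sched.h>   /* unshare(), CLONE_FILES */
#include <unistd.h>  /* close(), sysconf() */

/*
 * Hypothetical userspace helper, not a libc or kernel API.
 * do_unshare != 0 plays the role of CLOSE_RANGE_UNSHARE.
 */
int emulated_close_range(unsigned int first, unsigned int last,
                         int do_unshare)
{
    long open_max;

    if (first > last) {
        errno = EINVAL;
        return -1;
    }

    /*
     * Unshare first, regardless of whether any fd in the range is open.
     * Skipping this step for an empty range would leave a window in
     * which another task sharing the table could add descriptors.
     */
    if (do_unshare && unshare(CLONE_FILES) == -1)
        return -1;

    /* Assumption: no open descriptor lies at or above this bound. */
    open_max = sysconf(_SC_OPEN_MAX);
    if (open_max <= 0)
        open_max = 1024;            /* arbitrary fallback */
    if (open_max > INT_MAX)
        open_max = INT_MAX;
    if (last >= (unsigned long) open_max)
        last = (unsigned int) (open_max - 1);

    /* Errors from close() are deliberately ignored. */
    for (unsigned int fd = first; fd <= last; fd++)
        (void) close((int) fd);

    return 0;
}
```

With this sketch, a call such as emulated_close_range(20, 30, 0) on a
process that only has low-numbered descriptors open returns 0 rather than
failing with EBADF, matching the no-op case raised above.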