On Fri, Aug 16, 2024 at 12:15:12PM +0100, Al Viro wrote:
> On Fri, Aug 16, 2024 at 10:25:52AM +0200, Christian Brauner wrote:
>
> > I don't think so. It is clear that the file descriptor table is unshared
> > and that fds are closed afterwards and that this can race with file
> > descriptors being inserted into the currently shared fdtable. Imho,
> > there's nothing to fix here.
> >
> > I also question whether any userspace out there has any such ordering
> > expectations between the two dup2()s and the close_range() call and
> > specifically whether we should even bother giving any such guarantees.
>
> Huh?
>
> It's not those dup2() vs unsharing; it's relative order of those dup2().
>
> Hell, make that
>
> 	dup2(0, 1023);
> 	dup2(1023, 10);
>
> Do you agree that asynchronous code observing 10 already open, but 1023
> still not open would be unexpected?

FWIW, for descriptor table unsharing we do (except for that odd case) have
the following:

	* the effect of operations not ordered wrt unshare (i.e. done by
	  another thread with no userland serialization) may or may not be
	  visible in the unshared copy; however, if two operations are
	  ordered wrt to each other, we won't see the effect of the later
	  one without the effect of the earlier.

Here neither of those dup2() is ordered wrt unsharing close_range(); we
might see the effect of both or none or only the first one, but seeing the
effect of the second _without_ the effect of the first is very odd,
especially since the effect of the second does depend upon just the state
change we do *NOT* see.

Actual closing done by unsharing close_range() is not an issue - none of
the affected descriptors are getting closed. It's the unshare part that is
deeply odd here. And yes, unshare(2) (or clone(2) without CLONE_FILES)
would have the ordering warranties I'm talking about.
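
To make the ordering rule above concrete, here is a rough userland sketch, not a proper reproducer. One thread repeats the two dup2() calls in program order while the caller takes private copies of the descriptor table and looks for the "10 open, 1023 not open" state. The copy is taken with fork(), i.e. clone(2) without CLONE_FILES, which is one of the cases said above to have the guarantee, so the count would be expected to stay at zero. The names writer(), fd_is_open() and count_anomalies(), the caller-supplied source descriptor (instead of fd 0) and the number of rounds are all made up for illustration. Probing the odd case would presumably mean replacing the fork() snapshot with an unsharing close_range() call; its arguments are not given in this message, so that is not shown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { FD_HIGH = 1023, FD_LOW = 10 };  /* the two dup2() targets from the example */

static atomic_int stop_writer;         /* hypothetical stop flag for the writer thread */

/* Returns 1 if fd is open in the calling process's descriptor table. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Writer: the two dup2() calls, ordered wrt each other, then undone. */
static void *writer(void *arg)
{
    int src = *(int *)arg;

    while (!atomic_load(&stop_writer)) {
        dup2(src, FD_HIGH);     /* first: opens 1023 */
        dup2(FD_HIGH, FD_LOW);  /* second: only works if 1023 is open */
        close(FD_LOW);          /* undo in reverse order, so "10 open" */
        close(FD_HIGH);         /* always implies "1023 open" */
    }
    return NULL;
}

/*
 * Take `rounds` private copies of the descriptor table with fork() and
 * count the copies in which FD_LOW is open while FD_HIGH is not.
 * Returns -1 if the writer thread cannot be started.
 */
static int count_anomalies(int src, int rounds)
{
    pthread_t tid;
    int anomalies = 0;

    assert(src != FD_LOW && src != FD_HIGH);
    close(FD_LOW);              /* start from a known state */
    close(FD_HIGH);

    atomic_store(&stop_writer, 0);
    if (pthread_create(&tid, NULL, writer, &src) != 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        int status;
        pid_t pid = fork();

        if (pid == 0) {
            /* Child: single-threaded, table is a frozen private copy. */
            int low = fd_is_open(FD_LOW);
            int high = fd_is_open(FD_HIGH);
            _exit(low && !high);
        }
        if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            anomalies += WEXITSTATUS(status);
    }

    atomic_store(&stop_writer, 1);
    pthread_join(tid, NULL);
    return anomalies;
}
```

A caller would open some descriptor (say /dev/null) and pass it as src together with a round count. The sketch assumes RLIMIT_NOFILE is above 1023; if it is not, the first dup2() simply fails and nothing is ever observed. A zero count from this is not proof of anything, it only shows which state the observer is looking for.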