On 5/1/20 1:00 PM, Al Viro wrote: > On Fri, May 01, 2020 at 11:54:01AM -0600, Jens Axboe wrote: > >> @@ -427,8 +424,17 @@ static int do_eventfd(unsigned int count, int flags) >> >> fd = anon_inode_getfd("[eventfd]", &eventfd_fops, ctx, >> O_RDWR | (flags & EFD_SHARED_FCNTL_FLAGS)); >> - if (fd < 0) >> + if (fd < 0) { >> eventfd_free_ctx(ctx); >> + } else { >> + struct file *file; >> + >> + file = fget(fd); >> + if (file) { >> + file->f_mode |= FMODE_NOWAIT; >> + fput(file); >> + } > > No. The one and only thing you can do to return value of anon_inode_getfd() is to > return the fscker to userland. You *CAN* *NOT* assume that descriptor table is > still pointing to whatever you've just created. > > As soon as it's in descriptor table, it's out of your hands. And frankly, if you > are playing with descriptors, you should be very well aware of that. > > Descriptor tables are fundamentally shared objects; they *can* be accessed and > modified by other threads, right behind your back. > > *IF* you are going to play with ->f_mode, you must use get_unused_fd_flags(), > anon_inode_getfile(), modify ->f_mode of the result and use fd_install() to > put it into descriptor table. With put_unused_fd() as cleanup in case > of allocation failure. OK, that makes sense, so we've got f_mode set before the fd_install() and fd visibility. I wrote that up, will test, and send out a v4... Thanks Al, this is very helpful. -- Jens Axboe