On Tue, Feb 6, 2024 at 6:30 PM Sargun Dhillon <sargun@xxxxxxxxx> wrote: > > On Tue, Feb 6, 2024 at 6:50 AM Jan Kara <jack@xxxxxxx> wrote: > > > > On Tue 06-02-24 09:44:29, Amir Goldstein wrote: > > > On Tue, Feb 6, 2024 at 1:24 AM Sargun Dhillon <sargun@xxxxxxxxx> wrote: > > > > > > > > One of the issues we've hit recently while using fanotify in an HSM is > > > > racing with files that are opened for execution. > > > > > > > > There is a race that can result in ETXTBUSY. > > > > Pid 1: You have a file marked with FAN_OPEN_EXEC_PERM. > > > > Pid 2: execve(file_by_path) > > > > Pid 1: gets notification, with file.fd > > > > Pid 2: blocked, waiting for notification to resolve > > > > Pid 1: Does work with FD (populates the file) > > > > Pid 1: writes FAN_ALLOW to the fanotify file descriptor allowing the event. > > > > Pid 2: continues, and falls through to deny_write_access (and fails) > > > > Pid 1: closes fd > > > > Right, this is kind of nasty. > > > > > > Pid 1 can close the FD before responding, but this can result in a > > > > race if fanotify is being handled in a multi-threaded > > > > manner. > > > > Yep. > > > > > > I.e. if there are two threads operating on the same fanotify group, > > > > and an event's FD has been closed, that can be reused > > > > by another event. This is largely not a problem because the > > > > outstanding events are added in a FIFO manner to the outstanding > > > > event list, and as long as the earlier event is closed and responded > > > > to without interruption, it should be okay, but it's difficult > > > > to guarantee that this happens, unless event responses are serialized > > > > in some fashion, with strict ordering between > > > > responses. > > > > Yes, essentially you must make sure you will not read any new events from > > the notification queue between fd close & writing of the response. Frankly, > > I find this as quite ugly and asking for trouble (subtle bugs down the > > line). > > > Is there a preference for either refactoring fanotify_event_metadata, or > adding this new ID type as a piece of metadata? > > I almost feel like the FD should move to being metadata, and we should > use ID in place of fd in fanotify_event_metadata. If we use an xarray, > it should be reasonable to use a 32-bit identifier, so we don't need > to modify the fanotify_event_metadata structure at all. > I have a strong preference for FANOTIFY_METADATA_VERSION 4 because I really would like event->key to be 64bit and in the header, but I have a feeling that Jan may have a different opinion.. > > > > There are a couple of ways I see around this: > > > > 1. Have a flag in the fanotify response that's like FAN_CLOSE_FD, > > > > where fanotify_write closes the fd when > > > > it processes the response. > > > > > > That seems doable and useful. > > > > I can work on this. We already have FAN_AUDIT and FAN_INFO as optional > bits that can be set in the response. I know there's another thread talking > about using the high-bits for error codes, and I wouldn't want to pollute > the higher bits too much if we go down that route. > There is plenty of bits space still. I could implement FAN_CLOSE_FD if we get to a conclusion that it will help. it should be quite trivial so better not do conflicting work in parallel. But I think that Jan's idea of not opening event->fd at all may be better. > I'm also not sure how best to implement capability probing (other than you > get an EINVAL in userspace and retry without that bit being set). > Yes, easy. You can event write a response with FAN_NOFD for fd after fanotify_init() to check capability. If capability is supported you will get -ENOENT otherwise -INVAL. Thanks, Amir.