On Tue, May 25, 2021 at 12:31:33PM +0200, Christian Brauner wrote: > On Mon, May 24, 2021 at 10:47:46AM +0200, Jan Kara wrote: > > On Sat 22-05-21 09:32:36, Matthew Bobrowski wrote: > > > On Fri, May 21, 2021 at 12:40:56PM +0200, Jan Kara wrote: > > > > On Fri 21-05-21 20:15:35, Matthew Bobrowski wrote: > > > > > On Thu, May 20, 2021 at 03:55:27PM +0200, Jan Kara wrote: > > > > > There's one thing that I'd like to mention, and it's something in > > > > > regards to the overall approach we've taken that I'm not particularly > > > > > happy about and I'd like to hear all your thoughts. Basically, with > > > > > this approach the pidfd creation is done only once an event has been > > > > > queued and the notification worker wakes up and picks up the event > > > > > from the queue processes it. There's a subtle latency introduced when > > > > > taking such an approach which at times leads to pidfd creation > > > > > failures. As in, by the time pidfd_create() is called the struct pid > > > > > has already been reaped, which then results in FAN_NOPIDFD being > > > > > returned in the pidfd info record. > > > > > > > > > > Having said that, I'm wondering what the thoughts are on doing pidfd > > > > > creation earlier on i.e. in the event allocation stages? This way, the > > > > > struct pid is pinned earlier on and rather than FAN_NOPIDFD being > > > > > returned in the pidfd info record because the struct pid has been > > > > > already reaped, userspace application will atleast receive a valid > > > > > pidfd which can be used to check whether the process still exists or > > > > > not. I think it'll just set the expectation better from an API > > > > > perspective. > > > > > > > > Yes, there's this race. OTOH if FAN_NOPIDFD is returned, the listener can > > > > be sure the original process doesn't exist anymore. So is it useful to > > > > still receive pidfd of the dead process? > > > > > > Well, you're absolutely right. However, FWIW I was approaching this > > > from two different angles: > > > > > > 1) I wanted to keep the pattern in which the listener checks for the > > > existence/recycling of the process consistent. As in, the listener > > > would receive the pidfd, then send the pidfd a signal via > > > pidfd_send_signal() and check for -ESRCH which clearly indicates > > > that the target process has terminated. > > > > > > 2) I didn't want to mask failed pidfd creation because of early > > > process termination and other possible failures behind a single > > > FAN_NOPIDFD. IOW, if we take the -ESRCH approach above, the > > > listener can take clear corrective branches as what's to be done > > > next if a race is to have been detected, whereas simply returning > > > FAN_NOPIDFD at this stage can mean multiple things. > > > > > > Now that I've written the above and keeping in mind that we'd like to > > > refrain from doing anything in the event allocation stages, perhaps we > > > could introduce a different error code for detecting early process > > > termination while attempting to construct the info record. WDYT? > > > > Sure, I wouldn't like to overengineer it but having one special fd value for > > "process doesn't exist anymore" and another for general "creating pidfd > > failed" looks OK to me. > > FAN_EPIDFD -> "creation failed" > FAN_NOPIDFD -> "no such process" Yes, I was thinking something along the lines of this... With the approach that I've proposed in this series, the pidfd creation failure trips up in pidfd_create() at the following condition: if (!pid || !pid_has_task(pid, PIDTYPE_TGID)) return -EINVAL; Specifically, the following check: !pid_has_task(pid, PIDTYPE_TGID) In order to properly report either FAN_NOPIDFD/FAN_EPIDFD to userspace, AFAIK I'll have to do one of either two things to better distinguish between why the pidfd creation had failed: 1) Implement an additional check in pidfd_create() that effectively checks whether provided pid still holds reference to a struct pid that isn't in the process of being cleaned up. If it is being cleaned up, then return something like -ESRCH instead of -EINVAL so that the caller, in this case fanotify, can check and set FAN_NOPIDFD if -ESRCH is returned from pidfd_create(). I definitely don't feel as though returning -ESRCH from the !pid_has_task(pid, PIDTYPE_TGID) would be appropriate. In saying that, I'm not aware of a helper by which would allow us to perform such an in-flight check? Perhaps something needs to be introduced here, IDK... 2) Refrain from performing any further changes to pidfd_create() i.e. as proposed in option 1), and manually perform the pidfd creation from some kind of new fanotify helper, as suggested by you here [0]. However, I'm not convinved that I like this approach as we may end up slowly drifting away from pidfd creation semantics over time. [0] https://www.spinics.net/lists/linux-fsdevel/msg195556.html /M