> > -----Original Message----- > From: Amir Goldstein <amir73il@xxxxxxxxx> > Sent: Friday, September 27, 2024 3:40 AM > To: Krishna Vivek Vitta <kvitta@xxxxxxxxxxxxx> > Cc: Jan Kara <jack@xxxxxxx>; linux-fsdevel@xxxxxxxxxxxxxxx; Eric Van Hensbergen <ericvh@xxxxxxxxxx>; Latchesar Ionkov <lucho@xxxxxxxxxx>; Dominique Martinet <asmadeus@xxxxxxxxxxxxx>; v9fs@xxxxxxxxxxxxxxx > Subject: Re: [EXTERNAL] Re: Git clone fails in p9 file system marked with FANOTIFY > > On Thu, Sep 26, 2024 at 10:28 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > > > > > What would be the next steps for this investigation ? > > > > > > > > > > I need to find some time and to debug the reason for 9p open failure > > > so we can make sure the problem is in 9p code and report more > > > details of the bug to 9p maintainers, but since a simple reproducer > > > exists, they can also try to reproduce the issue right now. > > > > FWIW, the attached reproducer mimics git clone rename_over pattern closer. > > It reproduces fanotify_read() errors sometimes, not always, with > > either CLOSE_WRITE or OPEN_PERM | CLOSE_WRITE. > > maybe CLOSE_WRITE alone has better odds - I'm not sure. > > > > scratch that. > I think the renames were just a destruction git clone events do not always fail on a close+rename pattern, they always fail on the close+unlink that follows the renames: > > 1776 openat(AT_FDCWD, "/vtmp/filebench/.git/tjEzMUw", O_RDWR|O_CREAT|O_EXCL, 0600) = 3 > 1776 close(3) = 0 > 1776 unlink("/vtmp/filebench/.git/tjEzMUw") = 0 > 1776 symlink("testing", "/vtmp/filebench/.git/tjEzMUw") = 0 > 1776 lstat("/vtmp/filebench/.git/tjEzMUw", {st_mode=S_IFLNK|0777, st_size=7, ...}) = 0 > 1776 unlink("/vtmp/filebench/.git/tjEzMUw") = 0 > > I know because I added this print: > > --- a/fs/notify/fanotify/fanotify_user.c > +++ b/fs/notify/fanotify/fanotify_user.c > @@ -275,6 +275,7 @@ static int create_fd(struct fsnotify_group *group, const struct path *path, > */ > put_unused_fd(client_fd); > client_fd = PTR_ERR(new_file); > + pr_warn("%s(%pd2): ret=%d\n", __func__, path->dentry, > client_fd); > } else { > > We should consider adding it as is or maybe ratelimited. > > The trivial reproducer below fails fanotify_read() always with one try. > > Thanks, > Amir. > > int main() { > const char *filename = "config.lock"; > int fd; > > // Create a new file > fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644); > if (fd == -1) { > perror("Failed to create file"); > return EXIT_FAILURE; > } > close(fd); > > // Remove the file > if (unlink(filename) != 0) { > perror("Failed to unlink file"); > return EXIT_FAILURE; > } > > return EXIT_SUCCESS; > } On Fri, Sep 27, 2024 at 8:34 AM Krishna Vivek Vitta <kvitta@xxxxxxxxxxxxx> wrote: > > Hi Amir Hi Krishna, Please do not "top post" on mailing lists. It makes it very hard for people that are trying to follow the conversation in mailing archives or join in the middle of the conversation. > > Thanks for the experiment. > Though reproducer program is succeeding but fanotify listener is terminating since its failing to read the event. Right ? Right, the fanotify_example.c is programmed to stop on event read error. Productions fanotify listeners should log the error or ignore it, but not abort. > > Can you elaborate on: " We should consider adding it as is or maybe ratelimited." ? When fanotify fails to open an fd for reporting an event, we either return the error to fanotify_read() or silently ignore the event. What I meant is that we should add the pr_warn() in that case to make the problem more visible to sysadmins. > > Does this mean there should be a fix at fanotify side ? Yes, I think there should be a fix in fanotify. 9p is not doing something terribly wrong and as Jan already wrote, the exact same case from nfs/cifs is handled by ignoring the open error. > > Can you summarize the problem statement for once in larger interest of group. > Yes, I will try. fanotify needs to report an open fd with the event (in some operation modes). The open fd needs to be created and installed in the fd table of the listener process that is calling fanotify_read() to read the event. By the time that the listener process wants to read the events, the file in question may have already been unlinked. fanotify holds a reference on the dentry where an event happened while the event is in the queue. In local filesystems, it should not be a problem to open an unlinked file from a referenced dentry. For example, If adding sleep(1) before handle_events() then fanotify_example will print these messages from ext/xfs: FAN_CLOSE_WRITE: File /vdc/config.lock (deleted) The problem is that remote/network filesystems cannot provide full guarantee that an unlinked file can be opened even if the dentry reference exists. nfs/nfs4/cifs clients return -EOPENSTALE in that case, so vfs/nfsd can retry the open and fanotify also recognizes this error and silently drops the event. But vfs/fanotify cannot guarantee that all the remote/network filesystems will return this special error code and as the comment in create_fd() says, there are many other cases that open can fail: /* * we still send an event even if we can't open the file. this * can happen when say tasks are gone and we try to open their * /proc files or we try to open a WRONLY file like in sysfs * we just send the errno to userspace since there isn't much * else we can do. */ The problems are: 1. The first line of the comment is wrong - we are not sending the event 2. We only send the errno to userspace if this is the first event being read 3. Even if userspace gets the errno, there is very little visibility about the event that caused the error (*) (*) This can be improved by the pr_warn() that I suggested Frankly, I never understood why this situation is not handled as the first line of the comment claims - that is to send the event with event->fd = -errno. This way userspace gets an event about every error and get try to do something about it (start an investigation). Getting the errno sometimes and ignoring it sometimes is a very poor UAPI. > I assume the whole of these experiments succeed in ext, xfs. > Yes. This specific experiment succeeds on ext/xfs. However, other failures to open fd for event can also happen on ext/xfs, for example if the listener is watching FAN_CLOSE|FAN_OPEN and fanotify_init() uses O_WRONLY as event_f_flags argument and the watched fs is mounted read-only, then you would get an error like this in fanotify_example, after opening a file for read on ext/xfs: read: Read-only file system At the bottom line, this is a UAPI issue. The UAPI can be improved, but we will probably have to make a new opt-in flag for a new UAPI. But I must say that I do not fully understand your original report. Your original report claimed that git clone fails when MDE(Microsoft Defender for Endpoint) is watching CLOSE_WRITE events on a 9p mount. Failures to report a FAN_CLOSE_WRITE event should have never failed git clone. Only failure to report FAN_OPEN_PERM could have failed git clone operations, but in that case, open would have failed anyway. My guess is that a would-be ENOENT open failure is replaced with an EPERM open failure when fanotify is watching FAN_OPEN_PERM events, but I have no confirmation to this assumption from your strace. Unless I misunderstand the report, it sounds like the fix (to git clone failure) should be in MDE software, because I could never reproduce git clone failure with fanotify_example, no matter with events I tried. Anyway, if I misunderstood something in the report, please clarify. Thanks, Amir.