From: Steve Grubb <sgrubb@xxxxxxxxxx> Hello, If a daemon using FANOTIFY needs to open a file on a watched filesystem and its wanting OPEN_PERM events, we get deadlock. (This could happen because of a library the daemon is using suddenly decides it needs to look in a new file.) Even though the man page says that the daemon should approve its own access decision, it really can't. If its in the middle of working on a request and that in turn generates another request, the second request is going to sit in the queue inside the kernel and not be read because the daemon is waiting on a library call that will never finish. We also have no idea how many requests are stacked up before we get to it. So, it really can't approve its own access requests. The solution is to assume that the daemon is going to approve its own file access requests. So, any requested access that matches the pid of the program receiving fanotify events should be pre-approved in the kernel and not sent to user space for approval. This should prevent deadlock. This behavior only exists if FAN_SELF_APPROVE is in the flags at fanotify_init() time. [Eric Sandeen: Make behavior contingent on fanotify_init flag] Signed-off-by: Steve Grubb <sgrubb@xxxxxxxxxx> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> --- Resending this; first submission to lkml generated no responses, but offline Eric Paris indicated that the original patch was "policy in the kernel," so I'll see what people think of making it contingent on an fanotify_init flag at syscall time. diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c index d2f97ec..9b5802c 100644 --- a/fs/notify/fanotify/fanotify.c +++ b/fs/notify/fanotify/fanotify.c @@ -105,6 +105,8 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark, { __u32 marks_mask, marks_ignored_mask; struct path *path = data; + pid_t grp_pid; + struct pid *cur_pid; pr_debug("%s: inode_mark=%p vfsmnt_mark=%p mask=%x data=%p" " data_type=%d\n", __func__, inode_mark, vfsmnt_mark, @@ -139,6 +141,17 @@ static bool fanotify_should_send_event(struct fsnotify_mark *inode_mark, BUG(); } + /* Auto-approve the listening process's own requests if asked to */ + grp_pid = pid_nr(vfsmnt_mark->group->fanotify_data.pid); + if (grp_pid) { + cur_pid = get_pid(task_tgid(current)); + if (grp_pid == pid_nr(cur_pid)) { + put_pid(cur_pid); + return false; + } + put_pid(cur_pid); + } + if (d_is_dir(path->dentry) && !(marks_mask & FS_ISDIR & ~marks_ignored_mask)) return false; diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c index 8e8e6bc..c81cee8 100644 --- a/fs/notify/fanotify/fanotify_user.c +++ b/fs/notify/fanotify/fanotify_user.c @@ -387,6 +387,8 @@ static int fanotify_release(struct inode *ignored, struct file *file) */ wake_up(&group->fanotify_data.access_waitq); #endif + /* Get rid of reference held since fanotify_init */ + put_pid(group->fanotify_data.pid); /* matches the fanotify_init->fsnotify_alloc_group */ fsnotify_destroy_group(group); @@ -741,6 +743,11 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) group->fanotify_data.user = user; atomic_inc(&user->fanotify_listeners); + if (flags & FAN_SELF_APPROVE) + group->fanotify_data.pid = get_pid(task_tgid(current)); + else + group->fanotify_data.pid = 0; + oevent = fanotify_alloc_event(NULL, FS_Q_OVERFLOW, NULL); if (unlikely(!oevent)) { fd = -ENOMEM; diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h index 533c440..48938ad 100644 --- a/include/linux/fsnotify_backend.h +++ b/include/linux/fsnotify_backend.h @@ -16,6 +16,7 @@ #include <linux/spinlock.h> #include <linux/types.h> #include <linux/atomic.h> +#include <linux/pid.h> /* * IN_* from inotfy.h lines up EXACTLY with FS_*, this is so we can easily @@ -184,6 +185,7 @@ struct fsnotify_group { int f_flags; unsigned int max_marks; struct user_struct *user; + struct pid *pid; } fanotify_data; #endif /* CONFIG_FANOTIFY */ }; diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h index 030508d..5b4ce4e 100644 --- a/include/uapi/linux/fanotify.h +++ b/include/uapi/linux/fanotify.h @@ -35,6 +35,7 @@ #define FAN_UNLIMITED_QUEUE 0x00000010 #define FAN_UNLIMITED_MARKS 0x00000020 +#define FAN_SELF_APPROVE 0x00000040 /* listener pid auto-approved */ #define FAN_ALL_INIT_FLAGS (FAN_CLOEXEC | FAN_NONBLOCK | \ FAN_ALL_CLASS_BITS | FAN_UNLIMITED_QUEUE |\ -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html