On Thursday, January 28, 2016 02:56:11 PM Jan Kara wrote: > Hi, > > On Tue 26-01-16 17:21:08, Eric Sandeen wrote: > > From: Steve Grubb <sgrubb@xxxxxxxxxx> > > > > If a daemon using FANOTIFY needs to open a file on a watched filesystem > > and > > its wanting OPEN_PERM events, we get deadlock. (This could happen because > > of a library the daemon is using suddenly decides it needs to look in a > > new > > file.) Even though the man page says that the daemon should approve its > > own > > access decision, it really can't. If its in the middle of working on a > > request and that in turn generates another request, the second request is > > going to sit in the queue inside the kernel and not be read because the > > daemon is waiting on a library call that will never finish. We also have > > no > > idea how many requests are stacked up before we get to it. So, it really > > can't approve its own access requests. > > > > The solution is to assume that the daemon is going to approve its own file > > access requests. So, any requested access that matches the pid of the > > program receiving fanotify events should be pre-approved in the kernel > > and not sent to user space for approval. This should prevent deadlock. > > > > This behavior only exists if FAN_SELF_APPROVE is in the flags at > > fanotify_init() time. > > > > [Eric Sandeen: Make behavior contingent on fanotify_init flag] > > > > Signed-off-by: Steve Grubb <sgrubb@xxxxxxxxxx> > > Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> > > --- > > > > Resending this; first submission to lkml generated no responses, but > > offline Eric Paris indicated that the original patch was "policy in the > > kernel," so I'll see what people think of making it contingent on an > > fanotify_init flag at syscall time. > > AKPM is merging fanotify patches these days so it is good to CC him. I'm > keeping eye on the notification infrastructure so you can CC me when you > need an opinion. > > Now to the patch: This solution really looks half baked to me. What if > there are two processes mediating the access? While this is certainly possible, is this actually done in real life? > You'll get the deadlock again: We have processes A and B mediating access. A > accesses some file f, A selfapproves the event for itself but the access is > still blocked on the approval from B. B proceeds to process the access > request by A. It accesses some file which generates the permission event - > now B is blocked waiting for A to approve its event. Dang. > > This really resembles a situation when e.g. multipathd must be damn carful > not to generate any IO while it is running lest it deadlocks while > reconfiguring paths. So here you have to be damn careful not to generate > any events when processing fanotify permission events... And I know that is > inconvenient but that's how things were designed and duck taping some > special cases IMHO is not acceptable. The issue here is that any relatively interesting program will have several libraries who could at any time decide it needs to open a file. Perhaps even /etc/hosts for network name resolution. You really don't know what the third party libraries might do and the consequences are pretty severe - deadlock. You could make it multi-threaded and have one thread dequeuing approval requests and another servicing them. But...each request has a live descriptor that counts towards the rlimit for max open descriptors. Hitting that is bad also. The simplest solution is to assume that the daemon is going to approve its own request. It would never refuse its own request, should it? -Steve > One solution which would look reasonably clean to me would be that a > process could have a capability which when set would mean that accesses by > that process would not generate fanotify permission events. This would > really avoid the deadlocks. But then botnet / virus writers would quickly > learn to set this capability for their processes so I'm not sure if this > doesn't somewhat defeat the purpose of permission events. OTOH setting > this capability would be gated by CAP_SYS_ADMIN anyway and once you have > this you have other ways to circumvent any protection so it is not > catastrophical. But still it makes life easier for the bad guys. > > Honza -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html