On Wed 30-03-16 14:47:31, Steve Grubb wrote: > On Thursday, January 28, 2016 02:56:11 PM Jan Kara wrote: > > On Tue 26-01-16 17:21:08, Eric Sandeen wrote: > > > From: Steve Grubb <sgrubb@xxxxxxxxxx> > > > > > > If a daemon using FANOTIFY needs to open a file on a watched filesystem > > > and > > > its wanting OPEN_PERM events, we get deadlock. (This could happen because > > > of a library the daemon is using suddenly decides it needs to look in a > > > new > > > file.) Even though the man page says that the daemon should approve its > > > own > > > access decision, it really can't. If its in the middle of working on a > > > request and that in turn generates another request, the second request is > > > going to sit in the queue inside the kernel and not be read because the > > > daemon is waiting on a library call that will never finish. We also have > > > no > > > idea how many requests are stacked up before we get to it. So, it really > > > can't approve its own access requests. > > > > > > The solution is to assume that the daemon is going to approve its own file > > > access requests. So, any requested access that matches the pid of the > > > program receiving fanotify events should be pre-approved in the kernel > > > and not sent to user space for approval. This should prevent deadlock. > > > > > > This behavior only exists if FAN_SELF_APPROVE is in the flags at > > > fanotify_init() time. > > > > > > [Eric Sandeen: Make behavior contingent on fanotify_init flag] > > > > > > Signed-off-by: Steve Grubb <sgrubb@xxxxxxxxxx> > > > Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> > > > --- > > > > > > Resending this; first submission to lkml generated no responses, but > > > offline Eric Paris indicated that the original patch was "policy in the > > > kernel," so I'll see what people think of making it contingent on an > > > fanotify_init flag at syscall time. > > > > AKPM is merging fanotify patches these days so it is good to CC him. I'm > > keeping eye on the notification infrastructure so you can CC me when you > > need an opinion. > > > > Now to the patch: This solution really looks half baked to me. What if > > there are two processes mediating the access? > > While this is certainly possible, is this actually done in real life? Honestly, I don't know. But it doesn't seem like too exotic configuration to me. Think for example about containers. It just seems as a bad design to restrict fanotify permission events to: "only one process in the system can safely use this". That has always proven to be a bad idea in the long run. > > You'll get the deadlock again: We have processes A and B mediating access. A > > accesses some file f, A selfapproves the event for itself but the access is > > still blocked on the approval from B. B proceeds to process the access > > request by A. It accesses some file which generates the permission event - > > now B is blocked waiting for A to approve its event. Dang. > > > > This really resembles a situation when e.g. multipathd must be damn carful > > not to generate any IO while it is running lest it deadlocks while > > reconfiguring paths. So here you have to be damn careful not to generate > > any events when processing fanotify permission events... And I know that is > > inconvenient but that's how things were designed and duck taping some > > special cases IMHO is not acceptable. > > The issue here is that any relatively interesting program will have several > libraries who could at any time decide it needs to open a file. Perhaps even > /etc/hosts for network name resolution. You really don't know what the third > party libraries might do and the consequences are pretty severe - deadlock. > > You could make it multi-threaded and have one thread dequeuing approval > requests and another servicing them. But...each request has a live descriptor > that counts towards the rlimit for max open descriptors. Hitting that is bad > also. I agree this reduces the attractivity of the two thread design. > The simplest solution is to assume that the daemon is going to approve > its own request. It would never refuse its own request, should it? I agree this is a solution which fixes your usecase but I don't think it is good enough for general use. I won't accept anything that doesn't work for several users of fanotify permission events. As I wrote below in my original email, you can use for example capabilities to make things work for several processes without too big hassle... > > One solution which would look reasonably clean to me would be that a > > process could have a capability which when set would mean that accesses by > > that process would not generate fanotify permission events. This would > > really avoid the deadlocks. But then botnet / virus writers would quickly > > learn to set this capability for their processes so I'm not sure if this > > doesn't somewhat defeat the purpose of permission events. OTOH setting > > this capability would be gated by CAP_SYS_ADMIN anyway and once you have > > this you have other ways to circumvent any protection so it is not > > catastrophical. But still it makes life easier for the bad guys. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html