On Mon, Jul 23, 2018 at 3:37 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Mon, Jul 23, 2018 at 3:05 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>> Biggest conceptual problem: your definition of fuse-server is weak.
>> Take the following example: process A is holding the fuse device fd
>> and is forwarding requests and replies to/from process B via a pipe.
>> So basically A is just a proxy that does nothing interesting, the
>> "real" server is B. But according to your definition B is not a
>> server, only A is.
>
> I proposed to abort fuse conn when all fuse device fd's are "killed"
> (all processes having the fd opened are killed). So if _only_ process
> B is killed, then, yes, it will still hang. However if A is killed or
> both A and B (say, process group, everything inside of pid namespace,
> etc) then the deadlock will be autoresolved without human
> intervention.

Okay, so you're saying:

1) when a process gets SIGKILL and is in uninterruptible sleep, mark
   the process as doomed

2) for a particular fuse instance, find the set of fuse device fd
   references that are in non-doomed tasks; if there are none, then
   abort the fuse instance

Right?

The above is not an implementation proposal, just to get us on the
same page regarding the concept.

>> And this is just a simple example, parts of the server might be on
>> different machines, etc... It's impossible to automatically detect if
>> a process is acting as a fuse server or not.
>
> It does not seem we need the precise definition. If no one ever can
> write anything into the fd, we can safely abort the connection (?).

Seems to me so.

> If
> we don't, we can either get that the process exits normally and the
> connection is doomed anyway, so no difference in behavior, or we can
> get a deadlock.
>
>> We could let the fuse server itself notify the kernel that it's a fuse
>> server. That might help in the cases where the deadlock is
>> accidental, but obviously not in the case when done by a malicious
>> agent. I'm not sure it's worth the effort. Also I have no idea how
>> the respective maintainers would take the idea of "kill hooks"... It
>> would probably be a lot of work for little gain.
>
> What looks wrong to me here is that fuse is the only (?) subsystem in
> the kernel that stops SIGKILL from working and requires complex custom
> dance performed by a human operator (which is not necessary there at
> all). Say, if a process has opened a socket, whatever, I don't need to
> locate and abort something in socketctl fs, just SIGKILL. If a
> process has opened a file, I don't need to locate the fd in /proc
> and abort it, just SIGKILL. If a process has created an ipc object, I
> don't need to do any special dance, just SIGKILL. fuse is somehow very
> special, if we have more such cases, it definitely won't scale.
> I understand that there can be implementation difficulties, but
> fundamentally that's how things should work -- choose target
> processes, kill, done, right?

Yes, it would be nice. But I'm not sure it will fly due to
implementation difficulties. It's definitely not a high prio feature
currently for me, but I'll happily accept patches.

Thanks,
Miklos