Re: [PATCH v4 0/2] fuse: add timeout option for requests

Joanne Koong <joannelkoong@xxxxxxxxx> · Wed, 21 Aug 2024 14:22:05 -0700

On Wed, Aug 21, 2024 at 11:54 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>
> On Wed, 21 Aug 2024 at 20:11, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> > "A well written server" is the key part here ;).  In our case we had a "well
> > written server" that ended up having a deadlock and we had to run around with a
> > drgn script to find those hung mounts and kill them manually.  The use case here
> > is specifically for bugs in the FUSE server to allow us to clean up automatically
> > with EIOs rather than a drgn script to figure out if the mount is hung.
>
> So you'd like to automatically abort the connection to an
> unresponsive server?  I'm okay with that.
>
> What I'm worried about is the unintended side effects of a timed-out
> request without the server's knowledge (i.e. VFS locks released, then
> new request takes VFS lock).  If the connection to the server is
> aborted, then that's not an issue.
>
> It's also much simpler to just time out any response from the server
> (either read or write on /dev/fuse) than having to do per-request
> timeouts.

In our case, the deadlock was triggered by invalidating the inode in
the middle of handling a write request. The server gets stuck because
the inode invalidation (e.g. fuse_reverse_inval_inode()) tries to
acquire the folio lock, but that lock was already taken while
servicing the write request (e.g. in fuse_fill_write_pages()) and is
only released after the server has replied to the write request (e.g.
in fuse_send_write_pages()).

Without a kernel-enforced timeout, the only way out of this is to
abort the connection. A userspace timeout wouldn't help get the server
unstuck in this case. A kernel timeout forces the kernel's handling of
the write request to proceed, which drops the folio lock and returns
the server to a functioning state.
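
To make the lock ordering concrete, here is a rough userspace model of
the scenario above. It is only a sketch and not kernel or libfuse code:
the names (struct model, model_write(), model_server(),
model_invalidate()) are made up for illustration, a POSIX semaphore
stands in for the folio lock, and a second semaphore stands in for the
server's reply. It assumes Linux with POSIX threads and semaphores.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct model {
	sem_t folio_lock;        /* binary semaphore, 1 = unlocked */
	sem_t reply;             /* posted when the server replies */
	bool inval_before_reply; /* true = the buggy ordering */
};

/* Stand-in for inode invalidation: needs the folio lock briefly. */
static void model_invalidate(struct model *m)
{
	while (sem_wait(&m->folio_lock) == -1 && errno == EINTR)
		;
	sem_post(&m->folio_lock);
}

/* Stand-in for the server's write handler. */
static void *model_server(void *arg)
{
	struct model *m = arg;

	if (m->inval_before_reply)
		model_invalidate(m); /* blocks: write path holds the lock */
	sem_post(&m->reply);
	if (!m->inval_before_reply)
		model_invalidate(m); /* fine: lock is dropped after the reply */
	return NULL;
}

/*
 * Stand-in for the kernel write path. Returns 0 if the server replied
 * in time, or ETIMEDOUT if the wait for the reply timed out.
 */
int model_write(bool inval_before_reply, long timeout_ms)
{
	struct model m = { .inval_before_reply = inval_before_reply };
	struct timespec deadline;
	pthread_t server;
	int rc, err = 0;

	sem_init(&m.folio_lock, 0, 1);
	sem_init(&m.reply, 0, 0);

	sem_wait(&m.folio_lock); /* lock taken while servicing the write */
	rc = pthread_create(&server, NULL, model_server, &m);
	assert(rc == 0);

	/* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	while (sem_timedwait(&m.reply, &deadline) == -1) {
		if (errno == EINTR)
			continue;
		err = errno; /* ETIMEDOUT: give up on the reply */
		break;
	}

	sem_post(&m.folio_lock); /* request finished either way: drop lock */
	pthread_join(server, NULL); /* server is unblocked and can exit */
	sem_destroy(&m.reply);
	sem_destroy(&m.folio_lock);
	return err;
}
```

In this model, model_write(true, 100) times out, drops the lock, and
the server thread then runs to completion; model_write(false, 2000)
completes normally. If the timed wait were replaced with an unbounded
sem_wait(), the buggy ordering would hang forever, which is the
situation described above.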

I don't think situations like this are uncommon. For example, it's not
obvious or clear to developers that fuse_lowlevel_notify_inval_inode()
shouldn't be called inside of a write handler in their server code.

I believe Yafang had a use case for this as well in
https://lore.kernel.org/linux-fsdevel/20240724071156.97188-1-laoar.shao@xxxxxxxxx/
where they were seeing fuse connections becoming indefinitely stuck.

For your concern about potential unintended side effects of timed out
requests without the server's knowledge, could you elaborate more on
the VFS locking example? In my mind, a request that times out is the
same thing as a request that behaves normally and completes with an
error code, but perhaps not?

I also think having some way for system admins to enforce request
timeouts across the board might be useful - for example, if a
malicious fuse server doesn't reply to any requests, the requests hog
memory until the server is killed.

Thanks,
Joanne

>
> > It also gives us the opportunity to do the things that Bernd points out,
> > specifically remove the double buffering downside as we can trust that
> > eventually writeback will either succeed or timeout.  Thanks,
>
> Well, see this explanation for how this might deadlock on a memory
> allocation by the server:
>
>  https://lore.kernel.org/all/CAJfpegsfF77SV96wvaxn9VnRkNt5FKCnA4mJ0ieFsZtwFeRuYw@xxxxxxxxxxxxxx/
>
> Having a timeout would fix the deadlock, but it doesn't seem to me a
> proper solution.
>
> Thanks,
> Miklos