Re: [PATCH v2 2/2] fuse: remove tmp folio for writebacks and internal rb tree

Joanne Koong <joannelkoong@xxxxxxxxx> · Thu, 24 Oct 2024 09:54:55 -0700

On Mon, Oct 21, 2024 at 2:05 PM Joanne Koong <joannelkoong@xxxxxxxxx> wrote:
>
> On Mon, Oct 21, 2024 at 3:15 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >
> > On Fri, 18 Oct 2024 at 07:31, Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> >
> > > I feel like this is too much restrictive and I am still not sure why
> > > blocking on fuse folios served by non-privileges fuse server is worse
> > > than blocking on folios served from the network.
> >
> > Might be.  But historically fuse had this behavior and I'd be very
> > reluctant to change that unconditionally.
> >
> > With a systemwide maximal timeout for fuse requests it might make
> > sense to allow sync(2), etc. to wait for fuse writeback.
> >
> > Without a timeout allowing fuse servers to block sync(2) indefinitely
> > seems rather risky.
>
> Could we skip waiting on writeback in sync(2) if it's a fuse folio?
> That seems in line with the sync(2) documentation Jingbo referenced
> earlier where it states "The writing, although scheduled, is not
> necessarily complete upon return from sync()."
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/sync.html
>

So I think the answer to this is "no" for Linux. The Linux man page
for sync(2) says:

"According to the standard specification (e.g., POSIX.1-2001), sync()
schedules the writes, but may return before the actual writing is
done.  However Linux waits for I/O completions, and thus sync() or
syncfs() provide the same guarantees as fsync() called on every file
in the system or filesystem respectively." [1]

So regardless of the compaction / page migration issue, having
sync(2) block on fuse writeback is a dealbreaker.

I see two ways around this:

1) Optimistically skip the tmp folio, but add a timeout: if the
server doesn't reply to the writeback within X seconds, allocate the
tmp folio and clear writeback on the original folio immediately.
This would address any page migration deadlocks as well. We probably
wouldn't need the reclaim patch either, since that case could also be
handled by the timeout.
This would keep nearly all writebacks fast, but with a malicious
fuse server, sync() and page migration could wait up to an extra X
seconds.

2) Only skip the tmp folio for privileged fuse servers (we'd still
need to address the page migration path)
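
To make option 1 a bit more concrete, here is a rough userspace model
of the decision I have in mind. This is not kernel code and not from
the patch: the names (wb_request, wb_check, the WB_* actions) and the
millisecond timestamps are made up purely for illustration, and the
real thing would need to hook into the fuse request and folio
writeback paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of option 1; none of these names exist in fuse. */
enum wb_action {
	WB_WAIT,         /* keep waiting for the server's reply */
	WB_DONE,         /* server replied: writeback completes normally */
	WB_FALLBACK_TMP, /* timed out: copy to a tmp folio and clear
			  * writeback on the original folio */
};

struct wb_request {
	uint64_t start_ms; /* when the writeback was sent to the server */
	bool replied;      /* server has completed the request */
	bool uses_tmp;     /* we already fell back to a tmp folio */
};

/* Decide what to do with an in-flight writeback at time now_ms. */
static enum wb_action wb_check(struct wb_request *req, uint64_t now_ms,
			       uint64_t timeout_ms)
{
	assert(now_ms >= req->start_ms);

	if (req->replied)
		return WB_DONE;

	/* Already fell back: the original folio is no longer held up. */
	if (req->uses_tmp)
		return WB_WAIT;

	/* Server exceeded its budget: stop holding the original folio. */
	if (now_ms - req->start_ms >= timeout_ms) {
		req->uses_tmp = true;
		return WB_FALLBACK_TMP;
	}

	return WB_WAIT;
}
```

The point of the model is that sync() and page migration only ever
wait on the original folio until WB_FALLBACK_TMP fires, so their
worst-case extra delay is bounded by the timeout.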

Would love to hear thoughts on this. Should we go ahead with option 1?

[1] https://man7.org/linux/man-pages/man2/sync.2.html

>
> Thanks,
> Joanne
> >
> > Thanks,
> > Miklos