On 12/20/24 12:44, David Hildenbrand wrote:
> On 19.12.24 18:54, Shakeel Butt wrote:
>> On Thu, Dec 19, 2024 at 09:44:42AM -0800, Joanne Koong wrote:
>>> On Thu, Dec 19, 2024 at 9:37 AM Shakeel Butt <shakeel.butt@xxxxxxxxx>
>>> wrote:
>> [...]
>>>>>
>>>>> The request is canceled then - that should clear the page/folio state
>>>>>
>>>>> I start to wonder if we should introduce really short fuse request
>>>>> timeouts and just repeat requests when things have cleared up. At
>>>>> least for write-back requests (in the sense that fuse-over-network
>>>>> might be slow or interrupted for some time).
>>>>>
>>>>
>>>> Thanks Bernd for the response. Can you tell a bit more about the
>>>> request timeouts? Basically does it impact/clear the page/folio
>>>> state as well?
>>>
>>> Request timeouts can be set by admins system-wide to protect against
>>> malicious/buggy fuse servers that do not reply to requests by a
>>> certain amount of time. If the request times out, then the whole
>>> connection will be aborted, and pages/folios will be cleaned up
>>> accordingly. The corresponding patchset is here [1]. This helps
>>> mitigate the possibility of unprivileged buggy servers tieing up
>>> writeback state by not replying.
>>>
>>
>> Thanks a lot Joanne and Bernd.
>>
>> David, does these timeouts resolve your concerns?
>
> Thanks for that information. Yes and no. :)
>
> Bernd wrote: "I start to wonder if we should introduce really short fuse
> request timeouts and just repeat requests when things have cleared up.
> At least for write-back requests (in the sense that fuse-over-network
> might be slow or interrupted for some time).
>
> Indicating to me that while timeouts might be supported soon (will there
> be a sane default?) even trusted implementations can run into this
> (network example above) where timeouts might actually be harmful I suppose?

Yeah, and that makes it hard to provide a default. In Joanne's timeout
patches the admin can set a system default.
https://lore.kernel.org/all/20241218222630.99920-3-joannelkoong@xxxxxxxxx/

>
> I'm wondering if there would be a way to just "cancel" the writeback and
> mark the folio dirty again. That way it could be migrated, but not
> reclaimed. At least we could avoid the whole AS_WRITEBACK_INDETERMINATE
> thing.
>

That is basically what I meant by short timeouts. Obviously it is not
that simple to cancel the request and retry - it would add quite some
complexity, assuming all the issues that arise can be solved at all.

Thanks,
Bernd