On 12/19/24 18:37, Shakeel Butt wrote:
> On Thu, Dec 19, 2024 at 06:30:34PM +0100, Bernd Schubert wrote:
>>
>> On 12/19/24 18:26, David Hildenbrand wrote:
>>> On 19.12.24 18:14, Shakeel Butt wrote:
>>>> On Thu, Dec 19, 2024 at 05:41:36PM +0100, David Hildenbrand wrote:
>>>>> On 19.12.24 17:40, Shakeel Butt wrote:
>>>>>> On Thu, Dec 19, 2024 at 05:29:08PM +0100, David Hildenbrand wrote:
>>>>>> [...]
>>>>>>>>
>>>>>>>> If you check the code just above this patch, this
>>>>>>>> mapping_writeback_indeterminate() check only happens for pages under
>>>>>>>> writeback which is a temp state. Anyways, fuse folios should not be
>>>>>>>> unmovable for their lifetime but only while under writeback which is
>>>>>>>> same for all fs.
>>>>>>>
>>>>>>> But there, writeback is expected to be a temporary thing, not
>>>>>>> possibly: "AS_WRITEBACK_INDETERMINATE", that is a BIG difference.
>>>>>>>
>>>>>>> I'll have to NACK anything that violates ZONE_MOVABLE / ALLOC_CMA
>>>>>>> guarantees, and unfortunately, it sounds like this is the case
>>>>>>> here, unless I am missing something important.
>>>>>>>
>>>>>>
>>>>>> It might just be the name "AS_WRITEBACK_INDETERMINATE" is causing
>>>>>> the confusion. The writeback state is not indefinite. A proper fuse fs,
>>>>>> like any other fs, should handle writeback pages appropriately. These
>>>>>> additional checks and skips are for (I think) untrusted fuse servers.
>>>>>
>>>>> Can unprivileged user space provoke this case?
>>>>
>>>> Let's ask Joanne and other fuse folks about the above question.
>>>>
>>>> Let's say unprivileged user space can start an untrusted fuse server,
>>>> mount fuse, allocate and dirty a lot of fuse folios (within its dirty
>>>> and memcg limits) and trigger the writeback. To cause pain (through
>>>> fragmentation), it is not clearing the writeback state. Is this the
>>>> scenario you are envisioning?
>>>
>>> Yes, for example causing harm on a shared host (containers, ...).
>>>
>>> If it cannot happen, we should make it very clear in documentation and
>>> patch descriptions that it can only cause harm with privileged user
>>> space, and that this harm can make things like CMA allocations, memory
>>> onplug, ... fail, which is rather bad and against concepts like
>>> ZONE_MOVABLE/MIGRATE_CMA.
>>>
>>> Although I wonder what would happen if the privileged user space daemon
>>> crashes (e.g., OOM killer?) and simply no longer replies to any messages.
>>>
>>
>> The request is canceled then - that should clear the page/folio state
>>
>>
>> I start to wonder if we should introduce really short fuse request
>> timeouts and just repeat requests when things have cleared up. At least
>> for write-back requests (in the sense that fuse-over-network might
>> be slow or interrupted for some time).
>>
>>
>
> Thanks Bernd for the response. Can you tell a bit more about the request
> timeouts? Basically does it impact/clear the page/folio state as well?

That is just an idea, needs more discussion first. Just sent an off-list
message.