Re: [PATCH v6 4/5] mm/migrate: skip migrating folios under writeback with AS_WRITEBACK_INDETERMINATE mappings

Bernd Schubert <bernd.schubert@xxxxxxxxxxx> · Thu, 26 Dec 2024 23:44:37 +0100

On 12/23/24 20:00, Joanne Koong wrote:
> On Sat, Dec 21, 2024 at 1:59 PM Bernd Schubert
> <bernd.schubert@xxxxxxxxxxx> wrote:
>>
>>
>>
>> On 12/21/24 17:25, David Hildenbrand wrote:
>>> On 20.12.24 22:01, Joanne Koong wrote:
>>>> On Fri, Dec 20, 2024 at 6:49 AM David Hildenbrand <david@xxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>>>> I'm wondering if there would be a way to just "cancel" the
>>>>>>> writeback and
>>>>>>> mark the folio dirty again. That way it could be migrated, but not
>>>>>>> reclaimed. At least we could avoid the whole
>>>>>>> AS_WRITEBACK_INDETERMINATE
>>>>>>> thing.
>>>>>>>
>>>>>>
>>>>>> That is what I basically meant with short timeouts. Obviously it is not
>>>>>> that simple to cancel the request and to retry - it would add in quite
>>>>>> some complexity, if all the issues that arise can be solved at all.
>>>>>
>>>>> At least it would keep that out of core-mm.
>>>>>
>>>>> AS_WRITEBACK_INDETERMINATE really has weird smell to it ... we should
>>>>> try to improve such scenarios, not acknowledge and integrate them, then
>>>>> work around using timeouts that must be manually configured, and ca
>>>>> likely no be default enabled because it could hurt reasonable use
>>>>> cases :(
>>>>>
>>>>> Right now we clear the writeback flag immediately, indicating that data
>>>>> was written back, when in fact it was not written back at all. I suspect
>>>>> fsync() currently handles that manually already, to wait for any of the
>>>>> allocated pages to actually get written back by user space, so we have
>>>>> control over when something was *actually* written back.
>>>>>
>>>>>
>>>>> Similar to your proposal, I wonder if there could be a way to request
>>>>> fuse to "abort" a writeback request (instead of using fixed timeouts per
>>>>> request). Meaning, when we stumble over a folio that is under writeback
>>>>> on some paths, we would tell fuse to "end writeback now", or "end
>>>>> writeback now if it takes longer than X". Essentially hidden inside
>>>>> folio_wait_writeback().
>>>>>
>>>>> When aborting a request, as I said, we would essentially "end writeback"
>>>>> and mark the folio as dirty again. The interesting thing is likely how
>>>>> to handle user space that wants to process this request right now (stuck
>>>>> in fuse_send_writepage() I assume?), correct?
>>>>
>>>> This would be fine if the writeback request hasn't been sent yet to
>>>> userspace but if it has and the pages are spliced
>>>
>>> Can you point me at the code where that splicing happens?
>>
>> fuse_dev_splice_read()
>>   fuse_dev_do_read()
>>     fuse_copy_args()
>>       fuse_copy_page
>>
>>
>> Btw, for the non splice case, disabling migration should be
>> only needed while it is copying to the userspace buffer?
> 
> I don't think so. We don't currently disable migration when copying
> to/from the userspace buffer for reads.

Sorry for my late reply. I'm confused about "reads". This discussions
is about writeback?
Without your patches we have tmp-pages - migration disabled on these. 
With your patches we have AS_WRITEBACK_INDETERMINATE - migration
also disabled?

I think we have two code paths

a) fuse_dev_read - does a full buffer copy. Why do we need tmp-pages
for these at all? The only time migration must not run on these pages
while it is copying to the userspace buffer?

b) fuse_dev_splice_read - isn't this our real problem, as we don't
know when pages in the pipe are getting consumed?

Thanks,
Bernd