Re: [PATCH RESEND v6 00/16] mm: Page fault enhancements

David Hildenbrand <david@xxxxxxxxxx> · Sun, 8 Mar 2020 13:12:34 +0100

[...]

> Yes, IIUC the race can happen like this in your below test:
> 
>      main thread          uffd thread             disgard thread
>      ===========          ===========             ==============
>      access page
>        uffd page fault
>          wait for page
>                           UFFDIO_ZEROCOPY
>                             put a page P there
>                                                   MADV_DONTNEED on P
>                             wakeup main thread
>          return from fault
>        page still missing
>        uffd page fault again
>        (without ALLOW_RETRY)
>        --> SIGBUS.

Exactly!

>> Can we please have a way to identify that this "feature" is available?
>> I'd appreciate a new read-only UFFD_FEAT_ , so we can detect this from
>> user space easily and use concurrent discards without crashing our applications.
> 
> I'm not sure how others think about it, but to me this still fells
> into the bucket of "solving an existing problem" rather than a
> feature.  Also note that this should change the behavior for the page
> fault logic in general, rather than an uffd-only change. So I'm also
> not sure whether UFFD_FEAT_* suites here even if we want it.

So, are we planning on backporting this to stable kernels?

Imagine using this in QEMU/KVM to allow discards (e.g., balloon
inflation) while postcopy is active . You certainly don't want random
guest crashes. So either, we treat this as a fix (and backport) or as a
change in behavior/feature.

[...]

>>
>> 2. What will happen if I don't place a page on a pagefault, but only do a UFFDIO_WAKE?
>>    For now we were able to trigger a signal this way.
> 
> If I'm not mistaken the UFFDIO_WAKE will directly trigger the sigbus
> even without the help of the MADV_DONTNEED race.

Yes, that's the current way of injecting a SIGBUS instead of resolving
the pagefault. And AFAIKs, you're changing that behavior. (I am not
aware of a user, there could be use cases, but somehow it's strange to
get a signal when accessing memory that is mapped READ|WRITE and also
represented like this in e.g., /proc/$PID/maps). So IMHO, only the new
behavior makes really sense.

> 
>> If the behavior is changed, can
>>    we make this configurable via a UFFD_FEAT?
> 
> I'll still think that could be an overkill, but I'll leave the
> discussion to the experts.

I'll be happy to hear what Andrea Et al. think. At least I really want
to see the new behavior - and if it's not a fix, then I want some way to
detect if a kernel has this new (fixed?) behavior.

Thanks a lot for this work!

-- 
Thanks,

David / dhildenb