Re: [PATCH 0/1] soft_dirty: fix soft_dirty during THP split

On 08/19/2016 05:37 PM, Andrea Arcangeli wrote:
> On Fri, Aug 19, 2016 at 04:52:51PM +0300, Pavel Emelyanov wrote:
>> Hm... Are you talking about some in-kernel test, or just any? We have
>> tests in CRIU tree for UFFD (not sure we've wired up the non-cooperative
>> part though).
> 
> Nice. I wasn't aware you had uffd specific tests in CRIU, I'll check.
> 
> I was referring to the tools/testing/selftests/vm/userfault*, but I
> suppose it's fine in CRIU as well. A self-contained test suitable for
> testing/selftests would be nice too, as not everyone will run CRIU tests
> to test the kernel.
> 
> Currently what's tested is anon missing, tmpfs missing and hugetlbfs
> missing and they all work (just fixed two tmpfs bugs yesterday thanks
> to the tmpfs test that crashed my workstation when I tried it, now it
> passes fine :).
> 
>> And my main worry about this is COW-sharing. If we have two tasks that
>> were fork()-ed from each other and we try to lazily restore a page that
>> is still COW-shared between them, the uffd API gives us no way to do
>> that. So we effectively break COW on lazy restore. Do you have any
>> ideas about what can be done about it?
> 
> Building a shared page is tricky, not even khugepaged was doing that
> for anon.
> 
> Kirill extended khugepaged to do it, along with the THP-on-tmpfs
> support, as it's more important for tmpfs (I haven't yet checked if it
> landed upstream with the rest of tmpfs in 4.8-rc though).
> 
> The main API problem is that the uffd is different between parent and
> child: fork with your non-cooperative patches gives you a new uffd
> that represents the child mm.

Yes.

> To create a shared page among two "mm" the API should be able to
> specify the two "mm" and two "addresses" atomically in the same
> ioctl. And the uffd _is_ the "mm" with the current API.

Well, with the current approach the mm equals the uffd file, so passing
one uffd descriptor into another's ioctl should do the trick.

> So what it takes is to add a UFFDIO_COPY_COW that takes as
> parameters an address for the current "uffd" and a list of "int uffd,
> unsigned long address" pairs.

Yup :)
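
To make the shape of that argument concrete, here is a rough sketch of
what it could look like from userspace. UFFDIO_COPY_COW does not exist
in the kernel; every struct, field and helper name below is made up for
illustration, and the checks are only the obvious ones (page alignment,
valid descriptors), assuming a power-of-two page size.

```c
/* Hypothetical sketch: UFFDIO_COPY_COW is only a proposal from this
 * thread, not an existing ioctl. All names here are invented. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One additional mapping that should share the installed page. */
struct uffdio_cow_target {              /* hypothetical */
	int32_t  uffd;                  /* uffd that stands for the other mm */
	uint32_t pad;
	uint64_t address;               /* page-aligned address in that mm */
};

/* Argument for the imagined ioctl, issued on the "current" uffd. */
struct uffdio_copy_cow {                /* hypothetical */
	uint64_t dst;                   /* address in the current uffd's mm */
	uint64_t src;                   /* source buffer in the caller */
	uint64_t len;                   /* length in bytes, page multiple */
	uint64_t ntargets;              /* number of entries in targets[] */
	const struct uffdio_cow_target *targets;
};

/* Sanity-check the argument; returns 0 if it looks valid, -1 if not. */
static inline int copy_cow_args_valid(const struct uffdio_copy_cow *a,
				      uint64_t page_size)
{
	uint64_t mask = page_size - 1;
	uint64_t i;

	/* The alignment test below relies on a power-of-two page size. */
	assert(page_size != 0 && (page_size & mask) == 0);

	if (a->len == 0 || (a->len & mask) || (a->dst & mask))
		return -1;
	if (a->ntargets > 0 && a->targets == NULL)
		return -1;
	for (i = 0; i < a->ntargets; i++) {
		if (a->targets[i].uffd < 0 || (a->targets[i].address & mask))
			return -1;
	}
	return 0;
}
```

The point of keeping all (uffd, address) pairs in one argument is that
the kernel would see every mm involved in a single call, which is what
the "atomically in the same ioctl" requirement above asks for.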

> Even with the UFFDIO_COPY things should still work solid, it'll just
> take more memory and it'll break-COW during restore. The important
> thing is "break" is as in "allocate more memory", not as in "crashing" :).
> 
>> We have ... readiness to do it :) since once CRIU hits this we'll have to.
> 
> Ok great.
> 
> I also thought about it a bit and I think it's just a matter of
> specifying which uffd should get the notification first. The manager
> will then take the notification first and call a
> UFFDIO_FAULT_PASS to cascade to the second uffd registered in the
> region if the page was missing in the source container, without waking
> up the task blocked in handle_userfault. To find out whether the page
> is missing in the source container you could use pagemap.
> 
> Thanks,
> Andrea
> .
> 
