Re: [RFC PATCH 0/3] asynchronously scan and free empty user PTE pages

On 18.06.24 09:51, Qi Zheng wrote:
Hi David,

On 2024/6/18 01:49, David Hildenbrand wrote:


No strong opinion, something synchronous sounds to me like the
low-hanging fruit, that could add the infrastructure to be used by
something more advanced/asynchronous :)

Got it, I will try to do the following in the next version.

a. for MADV_DONTNEED case, try synchronous reclaim as you said


I think that really is the low hanging fruit that would cover quite some
cases already: (1) reclaim when MADV_DONTNEED spans the complete page
table.
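For reference, case (1) as seen from userspace might look like the sketch below: a MADV_DONTNEED call whose range is aligned to, and exactly as long as, the span mapped by one PTE page. The 2 MiB span (x86-64 with 4 KiB pages), the PTE_TABLE_SPAN macro and the function name are assumptions for illustration only. If THP backs the region there may be no PTE page at all, which the sketch does not try to control.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: one PTE page maps 2 MiB (x86-64, 4 KiB pages). */
#define PTE_TABLE_SPAN ((size_t)2 << 20)

/* Returns 0 if a full-span MADV_DONTNEED succeeded, -1 otherwise. */
int dontneed_whole_pte_table(void)
{
    /* Map twice the span so that an aligned span fits inside it. */
    size_t len = 2 * PTE_TABLE_SPAN;
    unsigned char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    /* Round up to the next span boundary inside the mapping. */
    uintptr_t base = ((uintptr_t)map + PTE_TABLE_SPAN - 1) &
                     ~((uintptr_t)PTE_TABLE_SPAN - 1);
    unsigned char *aligned = (unsigned char *)base;

    /* Touch every page so the PTE entries get populated. */
    memset(aligned, 0xaa, PTE_TABLE_SPAN);

    /* Zap the whole span: afterwards the PTE page has no entries left. */
    int ret = madvise(aligned, PTE_TABLE_SPAN, MADV_DONTNEED);

    /* Private anonymous memory reads back as zero after MADV_DONTNEED. */
    if (ret == 0 && aligned[0] != 0)
        ret = -1;

    munmap(map, len);
    return ret;
}
```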

I will check and free the PTE page in the zap_pte_range() if the
(end - addr >= PMD_SIZE) condition is met.
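A userspace model of that condition, as a rough sketch: the caller clamps the range to one PMD entry (the kernel does this with pmd_addr_end()), so end - addr reaches PMD_SIZE only when the zap covers the whole PTE page. The constants assume x86-64 with 4 KiB pages, and both function names are made up for this example; the real check would sit inside zap_pte_range() under the appropriate locks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages and 512 PTEs per table, as on x86-64. */
#define PMD_SHIFT 21
#define PMD_SIZE  (UINT64_C(1) << PMD_SHIFT)
#define PMD_MASK  (~(PMD_SIZE - 1))

/*
 * Simplified model of pmd_addr_end(): clamp 'end' to the next PMD
 * boundary after 'addr'. Address wraparound is ignored here.
 */
uint64_t model_pmd_addr_end(uint64_t addr, uint64_t end)
{
    uint64_t boundary = (addr + PMD_SIZE) & PMD_MASK;
    return boundary < end ? boundary : end;
}

/*
 * [addr, end) is the part of the zap range that falls within a single
 * PMD entry. It can only be PMD_SIZE long if it covers the entire PTE
 * page, in which case the page is a candidate for being freed.
 */
bool zap_covers_whole_pte_table(uint64_t addr, uint64_t end)
{
    return end - addr >= PMD_SIZE;
}
```

With these definitions, a zap of [0x200000, 0x800000) is first clamped to [0x200000, 0x400000), which passes the check, while a range starting at 0x201000 does not.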


Then, there is (2) reclaim when MADV_DONTNEED spans only part of the
page table (e.g., single PTE), but my best guess is that it's better to
scan for that asynchronously than to make possibly each MADV_DONTNEED
syscall invocation slower.

Maybe just mark the vma, and then scan it in the system reclaim path.

I also plan to do this in the MADV_FREE case, instead of adding an
asynchronous madvise option first.


(1) would already help a lot and showcase how the locking/machinery
would work.


b. for MADV_FREE case:

     - add a madvise option for synchronous reclaim

     - add another madvise option to mark the vma, then add its
             corresponding mm to a global list, and then traverse
             the list and reclaim it when the memory is tight and
             enters the system reclaim path.
             (maybe there is an option to unmark)

c. for s390 case you mentioned, create a CONFIG_FREE_PT first, and
      then s390 will not select this config until the problem is solved.

d. for lockless scan, try disabling IRQs or using (mmap read lock +
pte_offset_map_nolock).
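To make the second option in (b) concrete, here is a toy userspace model of "mark, add to a global list, scan on reclaim". Everything in it (struct model_mm, the model_* functions, the empty_pte_pages counter) is a hypothetical stand-in rather than a kernel structure or API, and it leaves out all locking, which is the hard part in the real implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm; not a kernel structure. */
struct model_mm {
    int id;
    bool marked;             /* set by the hypothetical "mark" option */
    size_t empty_pte_pages;  /* PTE pages with no entries left */
    struct model_mm *next;   /* link in the global list */
};

/* Global list of marked mms, walked from the reclaim path. */
static struct model_mm *marked_list;

/* Mark an mm and put it on the global list (idempotent). */
void model_mark(struct model_mm *mm)
{
    if (mm->marked)
        return;
    mm->marked = true;
    mm->next = marked_list;
    marked_list = mm;
}

/* The "option to unmark": take the mm off the list again. */
void model_unmark(struct model_mm *mm)
{
    for (struct model_mm **pp = &marked_list; *pp; pp = &(*pp)->next) {
        if (*pp == mm) {
            *pp = mm->next;
            mm->next = NULL;
            mm->marked = false;
            return;
        }
    }
}

/*
 * Called when memory is tight: walk the marked mms and "free" their
 * empty PTE pages. Returns the number of pages reclaimed.
 */
size_t model_reclaim_scan(void)
{
    size_t freed = 0;

    for (struct model_mm *mm = marked_list; mm; mm = mm->next) {
        freed += mm->empty_pte_pages;
        mm->empty_pte_pages = 0;
    }
    return freed;
}
```

Only marked mms are visited, so an unmarked mm keeps its empty PTE pages until it is marked again or handled by the synchronous path.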

Although d) really only is desired when scanning asynchronously I think.
During (1) above, we know that the table will be very likely empty
(unless weird race).

Agree.

Again, thanks for working on this. Let me know (can also do privately) if you run into any issues or think I can be of help. :)

--
Cheers,

David / dhildenb




