No strong opinion, something synchronous sounds to me like the
low-hanging fruit that could add the infrastructure to be used by
something more advanced/asynchronous :)
Got it, I will try to do the following in the next version.
a. for MADV_DONTNEED case, try synchronous reclaim as you said
I think that really is the low-hanging fruit that would cover quite a
few cases already: (1) reclaim when MADV_DONTNEED spans the complete
page table.
Then, there is (2) reclaim when MADV_DONTNEED spans only part of the
page table (e.g., a single PTE), but my best guess is that it's better
to scan for that asynchronously than to make potentially every
MADV_DONTNEED syscall invocation slower.
(1) would already help a lot and showcase how the locking/machinery
would work.
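
To make the (1) vs. (2) distinction concrete, here is a minimal
user-space sketch (not kernel code) of the range check. It assumes
4 KiB pages and 512 PTEs per page table, as on x86-64, so one PTE table
maps 2 MiB. All sketch_* names are hypothetical and only for
illustration; address overflow is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages and 512 PTEs per table (x86-64 layout). */
#define SKETCH_PAGE_SIZE    4096ULL
#define SKETCH_PTRS_PER_PTE 512ULL
/* Virtual address range mapped by one PTE table: 2 MiB here. */
#define SKETCH_PT_SPAN      (SKETCH_PAGE_SIZE * SKETCH_PTRS_PER_PTE)

/*
 * Count the PTE tables that [start, start + len) covers completely.
 * Tables that are only partially covered at either end do not count.
 */
static uint64_t sketch_full_tables(uint64_t start, uint64_t len)
{
	uint64_t end = start + len;
	/* Round start up and end down to a page table boundary. */
	uint64_t first = (start + SKETCH_PT_SPAN - 1) / SKETCH_PT_SPAN
			 * SKETCH_PT_SPAN;
	uint64_t last = end / SKETCH_PT_SPAN * SKETCH_PT_SPAN;

	return last > first ? (last - first) / SKETCH_PT_SPAN : 0;
}

/*
 * Case (1): at least one table is fully covered, so a synchronous
 * reclaim attempt looks cheap. Otherwise it is case (2), which would
 * be left to an asynchronous scan.
 */
static bool sketch_sync_reclaim_candidate(uint64_t start, uint64_t len)
{
	return sketch_full_tables(start, len) > 0;
}
```

With these assumptions, a 2 MiB range starting at a 2 MiB boundary is a
candidate for (1), while the same length shifted by a single page
covers no table completely and falls under (2).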
b. for MADV_FREE case:
- add a madvise option for synchronous reclaim
- add another madvise option to mark the vma, then add its
corresponding mm to a global list, and then traverse
the list and reclaim it when the memory is tight and
enters the system reclaim path.
(maybe there is an option to unmark)
c. for s390 case you mentioned, create a CONFIG_FREE_PT first, and
then s390 will not select this config until the problem is solved.
d. for lockless scan, try disabling IRQs or using (mmap read lock +
pte_offset_map_nolock).
Although d) is really only desired when scanning asynchronously, I
think. During (1) above, we know that the table will very likely be
empty (barring a weird race).
--
Cheers,
David / dhildenb