Re: [PATCH] mm: Mark idle page tracking as BROKEN

On 16.06.21 21:23, Yu Zhao wrote:
On Wed, Jun 16, 2021 at 2:43 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 16.06.21 10:36, Vlastimil Babka wrote:
On 6/16/21 8:22 AM, Yu Zhao wrote:
On Tue, Jun 15, 2021 at 8:55 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:


I don't know.  I asked the others on the call and the answer I got was
essentially "Just delete it".

I'm kind of hoping the others speak up.

I listed a couple of things when acking this patch. Being broken is
not a problem as long as there are users who care about it. What made
me think such users may not exist is that nobody ever complained about
those things until we stumbled on them -- I'm not insisting on
deleting this feature, just clarifying why I thought so.

Similar feelings here. On the call it looked like the feature was abandoned by
its creators, and it wasn't clear if the distros that had it enabled did so due
to reasons that still apply for future versions. Sending the proposal and
getting feedback that there are users is one of the expected valid outcomes.

For us (RH) it will be very interesting to know the exact things that
are "suboptimal" (I'm avoiding the terminology "broken" here), so we can
actually evaluate if this might affect customers and might be worth
"improving".

I consider the examples I gave in my first email breakages -- others
broke/break the idle page tracking -- and I think it's safe to assume
they will continue to happen.

Right, just as with any other feature that has very bad (no?) upstream test coverage and doesn't immediately blow up if not done 100% right.


So to summarize (thanks for the input!):

1. It was really broken on arm64 before we had 07509e10dcc7 ("arm64: pgtable: Fix pte_accessible()") but should be working now.

2. Functions that call pte/pmd_mkold() but not test_and_clear_young() are shaky.

3. MADV_FREE'ed pages won't actually get freed and are instead treated as if they were reaccessed, because page_referenced() will return true upon seeing PageYoung().

4. Huge page handling is suboptimal and requires proper care from user space to get it right: https://lore.kernel.org/linux-mm/20210614081610.16123-1-sjpark@xxxxxxxxx/


I suspect daemon will have similar interest in optimizing 2 and 3, right?


If you are really looking for improvements, the page compaction has
always been a good example. For the idle page tracking, with physical
memory as little as 4GB, it needs to go through one million PFNs, no
matter how many compound or buddy pages there are. For THPs, it will
try to get_page_unless_zero() on tail pages, which always fails. This
is why we discussed it in the meeting.

Right, this sounds sub-optimal.
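
To put a number on the "one million PFNs" figure above, here is a rough sketch of the arithmetic. It assumes 4 KiB base pages and a contiguous physical address space with no holes, and it relies on the bitmap layout described in the kernel's idle page tracking documentation (an array of 8-byte words, with PFN i at bit i % 64 of word i / 64). The helper names are made up for illustration.

```python
PAGE_SIZE = 4096      # assumption: 4 KiB base pages
BITS_PER_WORD = 64    # the idle bitmap is an array of 8-byte words
WORD_BYTES = 8


def pfn_count(mem_bytes, page_size=PAGE_SIZE):
    """Number of page frames covering mem_bytes of physical memory."""
    return mem_bytes // page_size


def bitmap_location(pfn):
    """Return (byte offset of the 8-byte word, bit index in that word) for a PFN."""
    return (pfn // BITS_PER_WORD) * WORD_BYTES, pfn % BITS_PER_WORD


def bitmap_words(mem_bytes):
    """Number of 8-byte words needed to cover every PFN, rounding up."""
    pfns = pfn_count(mem_bytes)
    return (pfns + BITS_PER_WORD - 1) // BITS_PER_WORD


if __name__ == "__main__":
    mem = 4 * 2**30  # 4 GiB
    print("PFNs to walk:", pfn_count(mem))
    print("bitmap words:", bitmap_words(mem))
    print("location of PFN 65:", bitmap_location(65))
```

Every one of those PFNs is visited regardless of whether it is a THP tail page or a free buddy page, which is the cost being described here.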


What can't be improved is the memory locality of PFNs. They are not
grouped by memcgs or processes. Two PFNs next to each other can be
from two processes with two sets of five-level page tables. The cache
misses simply outweigh any potential benefits one might get from this
feature, speaking as one of the customers.

Right.

--
Thanks,

David / dhildenb





