Re: [PATCH 0/2] fix for "pathological THP behavior"

On Mon, Aug 20, 2018 at 4:58 AM, Kirill A. Shutemov
<kirill@xxxxxxxxxxxxx> wrote:
> On Sun, Aug 19, 2018 at 11:22:02PM -0400, Andrea Arcangeli wrote:
>> Hello,
>>
>> we detected a regression compared to older kernels, only happening
>> with defrag=always or by using MADV_HUGEPAGE (and QEMU uses it).
>>
>> I haven't bisected but I suppose this started since commit
>> 5265047ac30191ea24b16503165000c225f54feb combined with previous
>> commits that introduced the logic to not try to invoke reclaim for THP
>> allocations in the remote nodes.
>>
>> Once I looked into it the problem was pretty obvious and there are two
>> possible simple fixes, one is not to invoke reclaim and stick to
>> compaction in the local node only (still __GFP_THISNODE model).
>>
>> This approach keeps the logic the same and prioritizes NUMA
>> locality over THP generation.
>>
>> Then I'll send an alternative that drops the __GFP_THISNODE logic
>> if __GFP_DIRECT_RECLAIM is set. That however changes the behavior for
>> MADV_HUGEPAGE and prioritizes THP generation over NUMA locality.
>>
>> A possible incremental improvement for this __GFP_COMPACT_ONLY
>> solution would be to remove __GFP_THISNODE (and in turn
>> __GFP_COMPACT_ONLY) after checking the watermarks if there's no free
>> PAGE_SIZEd memory in the local node. However checking the watermarks
>> in mempolicy.c is not ideal so it would be a more messy change and
>> it'd still need to use __GFP_COMPACT_ONLY as implemented here for when
>> there's no PAGE_SIZEd free memory in the local node. That further
>> improvement wouldn't be necessary if there's agreement to prioritize
>> THP generation over NUMA locality (the alternative solution I'll send
>> in a separate post).
>
> I personally prefer to prioritize NUMA locality over THP
> (__GFP_COMPACT_ONLY variant), but I don't know page-alloc/compaction well
> enough to Ack it.


Could approach #1 break the zone_reclaim_mode setting? Or might it
behave as if zone_reclaim_mode were set even though the knob is cleared?
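
For reference, the two proposals in the quoted cover letter can be
modelled roughly as below. This is only a userspace sketch of the
description above, not the code in the patches: the SK_* bit values and
the function names are made up for illustration and are not the kernel's
real gfp flag encodings.

```c
#include <assert.h>

/* Illustrative bits only; NOT the kernel's real gfp flag values. */
#define SK_DIRECT_RECLAIM (1u << 0) /* stands in for __GFP_DIRECT_RECLAIM */
#define SK_THISNODE       (1u << 1) /* stands in for __GFP_THISNODE */
#define SK_COMPACT_ONLY   (1u << 2) /* stands in for the proposed __GFP_COMPACT_ONLY */

/*
 * Approach #1 (assumed shape): stay on the local node, and when direct
 * reclaim would be allowed, restrict the attempt to compaction only.
 */
unsigned int thp_gfp_approach1(unsigned int gfp)
{
    gfp |= SK_THISNODE;
    if (gfp & SK_DIRECT_RECLAIM)
        gfp |= SK_COMPACT_ONLY;
    return gfp;
}

/*
 * Approach #2 (assumed shape): drop the local-node restriction when
 * direct reclaim is allowed, otherwise keep it.
 */
unsigned int thp_gfp_approach2(unsigned int gfp)
{
    if (gfp & SK_DIRECT_RECLAIM)
        gfp &= ~SK_THISNODE;
    else
        gfp |= SK_THISNODE;
    return gfp;
}

/* Small usage check of the two models for a MADV_HUGEPAGE-style request. */
void sketch_demo(void)
{
    unsigned int madvised = SK_DIRECT_RECLAIM;

    /* #1 keeps locality and adds the compact-only restriction. */
    assert(thp_gfp_approach1(madvised) & SK_THISNODE);
    assert(thp_gfp_approach1(madvised) & SK_COMPACT_ONLY);

    /* #2 gives up locality so the allocation may fall back to remote nodes. */
    assert(!(thp_gfp_approach2(madvised) & SK_THISNODE));
}
```

In this model, approach #1 always keeps the local-node bit set, which is
the part the zone_reclaim_mode question is about.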

Thanks,
Yang

>
> --
>  Kirill A. Shutemov
>



