Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

On Mon, 29 Mar 2021 11:33:06 -0700
Yang Shi <shy828301@xxxxxxxxx> wrote:

> 
> When the THP NUMA fault support was added, THP migration was not supported yet.
> So the ad hoc THP migration was implemented in NUMA fault handling.  Since v4.14
> THP migration has been supported so it doesn't make too much sense to still keep
> another THP migration implementation rather than using the generic migration
> code.  It is definitely a maintenance burden to keep two THP migration
> implementations for different code paths and it is more error prone.  Using the
> generic THP migration implementation allows us to remove the duplicate code and
> some hacks needed by the old ad hoc implementation.
> 
> A quick grep shows x86_64, PowerPC (book3s), ARM64 and S390 support both THP
> and NUMA balancing.  Most of them support THP migration except for S390.
> Zi Yan tried to add THP migration support for S390 before but it was not
> accepted due to the design of S390 PMD.  For the discussion, please see:
> https://lkml.org/lkml/2018/4/27/953.
> 
> I'm not an expert on S390 so not sure if it is feasible to support THP migration
> for S390 or not.  If it is not feasible then the patchset may make THP NUMA
> balancing non-functional on S390.  Not sure if this is a show stopper although
> the patchset does simplify the code a lot.  Anyway it seems worth posting the
> series to the mailing list to get some feedback.

THP migration cannot work on s390 because the migration code will
establish swap ptes in a pmd. The pmd layout is very different from the
pte layout on s390, so you cannot simply write a swap pte into a pmd.
There are no separate swp primitives for swap/migration pmds, IIRC. And even
if there were, we'd still need to find some space for a present bit in the
s390 pmd, and/or possibly move around some other bits.
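
To illustrate the layout problem, here is a toy userspace sketch. All of
the toy_* names and bit positions are made up for this example and are NOT
the real s390 pte/pmd bits. It only assumes that the pte has a present bit
while the pmd signals "not present" via a bit that overlaps the swap type
field:

```c
#include <stdint.h>

/* Hypothetical layouts, for illustration only (NOT the real s390 bits). */
#define TOY_PTE_PRESENT         (1ULL << 0) /* pte: bit 0 set = present   */
#define TOY_PTE_SWP_TYPE_SHIFT  1           /* swap pte: bits 1..5 = type */
#define TOY_PTE_SWP_TYPE_MASK   0x1fULL
#define TOY_PTE_SWP_OFF_SHIFT   6           /* swap pte: bits 6.. = offset */

#define TOY_PMD_INVALID         (1ULL << 5) /* pmd: bit 5 set = invalid   */

/* Encode a swap entry using the pte layout; the present bit stays clear. */
static uint64_t toy_mk_swap_pte(unsigned int type, uint64_t offset)
{
	return ((type & TOY_PTE_SWP_TYPE_MASK) << TOY_PTE_SWP_TYPE_SHIFT) |
	       (offset << TOY_PTE_SWP_OFF_SHIFT);
}

/* Interpret a value with the pte rules. */
static int toy_pte_present(uint64_t pte)
{
	return (pte & TOY_PTE_PRESENT) != 0;
}

/* Interpret the very same value with the pmd rules. */
static int toy_pmd_present(uint64_t pmd)
{
	return (pmd & TOY_PMD_INVALID) == 0;
}
```

With these made-up layouts, toy_mk_swap_pte(3, 0x1234) is correctly seen as
not present by toy_pte_present(), but toy_pmd_present() reports it as
present, because bit 5 happens to be clear for swap type 3. For swap type 16
the same pmd check reports not present. So whether the entry looks valid as
a pmd depends on the swap type, which is the kind of breakage that separate
pmd swp primitives would have to avoid.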

Even if it were possible in theory, by introducing separate swp primitives
in common code for pmd entries, along with separate offset, type, shift,
etc., a lot of things can go wrong here. I don't see that happening in the
near future.

Not sure if this is a show stopper, but I am not familiar enough with
NUMA and migration code to judge. E.g., I do not see any swp entry action
in your patches, but I assume this is implicitly triggered by the switch
to generic THP migration code.

Could there be a work-around by splitting THP pages instead of marking them
as migrate pmds (via pte swap entries), at least when THP migration is not
supported? I guess it could also be acceptable if THP pages were simply not
migrated for NUMA balancing on s390, but then we might need some extra config
option to make that behavior explicit.
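
A rough sketch of the decision I have in mind, as a standalone toy model
and not actual kernel code. The struct, the enum and the
split_on_numa_fault knob are all hypothetical names for this example. In
the kernel, the first check would presumably map to something like
thp_migration_supported(), IIRC:

```c
#include <stdbool.h>

/* Possible outcomes of a NUMA hinting fault on a THP (toy model). */
enum toy_numa_action {
	TOY_MIGRATE_THP,      /* migrate the huge page as a whole         */
	TOY_SPLIT_AND_RETRY,  /* split the THP, migrate base pages later  */
	TOY_SKIP_MIGRATION,   /* leave the THP where it is                */
};

/* Hypothetical per-architecture properties. */
struct toy_arch {
	bool thp_migration_supported; /* arch can hold migration entries in a pmd */
	bool split_on_numa_fault;     /* assumed config option, does not exist    */
};

static enum toy_numa_action toy_numa_thp_fault(const struct toy_arch *arch)
{
	/* Generic THP migration needs pmd migration entries. */
	if (arch->thp_migration_supported)
		return TOY_MIGRATE_THP;

	/* Work-around: fall back to base pages, which use normal pte entries. */
	if (arch->split_on_numa_fault)
		return TOY_SPLIT_AND_RETRY;

	/* Otherwise make "no THP NUMA migration" an explicit choice. */
	return TOY_SKIP_MIGRATION;
}
```

With thp_migration_supported = false, this picks the split path only when
the (assumed) option is set, and otherwise skips migration, so the behavior
on s390 would be explicit either way.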

See also my comment on patch #5 of this series.

Regards,
Gerald



