Re: [PATCH v6 4/5] powerpc/64s: enable MMU_LAZY_TLB_SHOOTDOWN

[ Adding a few more x86 and arm64 maintainers - while linux-arch is
the right mailing list, I'm not convinced people actually follow it
all that closely ]

On Wed, Jan 18, 2023 at 12:00 AM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>
> On a 16-socket 192-core POWER8 system, a context switching benchmark
> with as many software threads as CPUs (so each switch will go in and
> out of idle), upstream can achieve a rate of about 1 million context
> switches per second, due to contention on the mm refcount.
>
> 64s meets the prerequisites for CONFIG_MMU_LAZY_TLB_SHOOTDOWN, so enable
> the option. This increases the above benchmark to 118 million context
> switches per second.

Well, the 1M -> 118M change does seem like a good reason for this series.

The patches certainly don't look offensive to me, so Ack as far as I'm
concerned, but honestly, it's been some time since I've personally
been active on the idle and lazy TLB code, so that ack is probably
largely worthless.

If anything, my main reaction to all this is to wonder whether the
config option is a good idea - maybe we could do this unconditionally,
and make the source code (and logic) simpler to follow when you don't
have to worry about the CONFIG_MMU_LAZY_TLB_REFCOUNT option.

I wouldn't be surprised to hear that x86 can have the same issue, where
the mm_struct refcount is a bigger problem than the possibility of an
extra TLB shootdown at the final exit time.

But having the config options as a way to switch people over gradually
(and perhaps then removing it later) doesn't sound wrong to me either.

And I personally find the argument in patch 3/5 fairly convincing:

  Shootdown IPIs cost could be an issue, but they have not been observed
  to be a serious problem with this scheme, because short-lived processes
  tend not to migrate CPUs much, therefore they don't get much chance to
  leave lazy tlb mm references on remote CPUs.

Andy? PeterZ? Catalin?

Nick - it might be good to link to the actual benchmark, and let
people who have access to big machines perhaps just try it out on
non-powerpc platforms...

                   Linus



