Re: [PATCH v2] x86: bring back rep movsq for user access on CPUs without ERMS

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Wed, 30 Aug 2023 09:50:46 -0700

On Wed, 30 Aug 2023 at 07:03, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>
> Hand-rolled mov loops executing in this case are quite pessimal compared
> to rep movsq for bigger sizes. While the upper limit depends on uarch,
> everyone is well south of 1KB AFAICS and sizes bigger than that are
> common.
>
> While technically ancient CPUs may be suffering from rep usage, gcc has
> been emitting it for years all over kernel code, so I don't think this
> is a legitimate concern.
>
> Sample result from read1_processes from will-it-scale (4KB reads/s):
> before: 1507021
> after:  1721828 (+14%)
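
For anyone reading without the patch at hand, the two copy strategies
being compared can be roughly sketched in user space. This is not the
kernel's copy_user code. The names copy_mov_loop, copy_rep_movsq and
copies_agree are hypothetical, and the inline asm assumes x86-64 with
GCC or Clang and the direction flag clear (as the ABI requires), with
a plain memcpy fallback elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hand-rolled mov loop: 8 bytes per step. */
static void copy_mov_loop(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len >= 8) {
		uint64_t w;

		memcpy(&w, s, 8);	/* compiles to a single 8-byte load */
		memcpy(d, &w, 8);	/* ... and a single 8-byte store */
		d += 8;
		s += 8;
		len -= 8;
	}
	while (len--)			/* byte-wise tail */
		*d++ = *s++;
}

/* Hypothetical stand-in for the rep movsq path: qwords first, then tail. */
static void copy_rep_movsq(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	size_t qwords = len / 8;
	size_t tail = len % 8;

	/* rdi = dst, rsi = src, rcx = count; all three are updated */
	__asm__ volatile("rep movsq"
			 : "+D"(dst), "+S"(src), "+c"(qwords)
			 :
			 : "memory");
	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(tail)
			 :
			 : "memory");
#else
	/* Assumption: no rep movsq available, fall back to libc. */
	memcpy(dst, src, len);
#endif
}

/* Returns 1 if both strategies produce identical, correct copies. */
static int copies_agree(size_t len)
{
	static unsigned char src[8192], a[8192], b[8192];

	assert(len <= sizeof(src));
	for (size_t i = 0; i < len; i++)
		src[i] = (unsigned char)(i * 31 + 7);
	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));

	copy_mov_loop(a, src, len);
	copy_rep_movsq(b, src, len);

	return memcmp(a, b, sizeof(a)) == 0 && memcmp(a, src, len) == 0;
}
```

The sketch only shows that the two paths are functionally equivalent.
It says nothing about the crossover size, which depends on uarch.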

Ok, patch looks fine to me now.

So I applied this directly to my tree, since I was the one doing the
x86 memcpy cleanups that removed the REP_GOOD hackery anyway.

              Linus