Re: lockref scalability on x86-64 vs cpu_relax

Mateusz Guzik <mjguzik@xxxxxxxxx> · Fri, 13 Jan 2023 02:12:50 +0100

On 1/13/23, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> Side note on your access() changes - if it turns out that you can
> remove all the cred games, we should possibly then revert my old
> commit d7852fbd0f04 ("access: avoid the RCU grace period for the
> temporary subjective credentials") which avoided the biggest issue
> with the unnecessary cred switching.
>
> I *think* access() is the only user of that special 'non_rcu' thing,
> but it is possible that the whole 'non_rcu' thing ends up mattering
> for cases where the cred actually does change because euid != uid (ie
> suid programs), so this would need a bit more effort to do performance
> testing on.
>

I don't think the games are avoidable. For one, I found non-root
processes with non-empty cap_effective even on my laptop, although I did
not check how often such processes call access().

Discussion for another time.

> On Thu, Jan 12, 2023 at 5:36 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>> All that said, I think the thing to do here is to replace cpu_relax
>> with a dedicated arch-dependent macro, akin to the following:
>
> I would actually prefer just removing it entirely and see if somebody
> else hollers. You have the numbers to prove it hurts on real hardware,
> and I don't think we have any numbers to the contrary.
>
> So I think it's better to trust the numbers and remove it as a
> failure, than say "let's just remove it on x86-64 and leave everybody
> else with the potentially broken code"
>
[snip]
> Then other architectures can try to run their numbers, and only *if*
> it then turns out that they have a reason to do something else should
> we make this conditional and different on different architectures.
>
> Let's try to keep the code as common as possibly until we have hard
> evidence for special cases, in other words.
>

I did not want to make such a change without redoing the ThunderX2
benchmark, or at least something else arm64-y. I may be able to bench it
tomorrow on whatever arm-y stuff can be found on Amazon's EC2, assuming
no arm64 people show up with their results.

Even then IMHO the safest route is to patch it out on x86-64 and give
other people time to bench their archs as they get around to it, and
ultimately whack the thing if it turns out nobody benefits from it.
I would say that beats backpedaling on the removal, but I'm not going to
fight for it.
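
To make the "patch it out on x86-64 only" option concrete, here is a
rough userspace sketch of the idea, not the actual patch and not the
kernel's lockref code. Every demo_* name, the bit layout and the retry
count are made up for illustration. It assumes GCC or Clang inline asm
and C11 atomics. The relax hook compiles to nothing on x86-64 and to
"yield" on arm64.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-arch relax hook for the cmpxchg retry loop.
 * Nothing on x86-64 (the case the numbers argue against), a
 * "yield" hint on arm64, a plain compiler barrier elsewhere.
 */
#if defined(__x86_64__)
#define demo_cmpxchg_relax() do { } while (0)
#elif defined(__aarch64__)
#define demo_cmpxchg_relax() __asm__ __volatile__("yield" ::: "memory")
#else
#define demo_cmpxchg_relax() __asm__ __volatile__("" ::: "memory")
#endif

/* Made-up layout: top bit means "locked", the rest is the count. */
#define DEMO_LOCKED  (UINT64_C(1) << 63)
#define DEMO_RETRIES 100

struct demo_lockref {
	_Atomic uint64_t word;
};

/*
 * Lockless fast path: bump the count with a compare-and-swap for as
 * long as the lock bit is clear. Returns false if the lock is held or
 * the retries run out; a real caller would then take the lock.
 */
static bool demo_lockref_get(struct demo_lockref *ref)
{
	uint64_t old = atomic_load_explicit(&ref->word, memory_order_relaxed);
	int retry;

	for (retry = DEMO_RETRIES; retry > 0; retry--) {
		if (old & DEMO_LOCKED)
			break;
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&ref->word, &old,
				old + 1, memory_order_relaxed,
				memory_order_relaxed))
			return true;
		demo_cmpxchg_relax();
	}
	return false;
}
```

With something of this shape, dropping the hint for everyone later
amounts to deleting the macro and its single call site.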

That said, does waiting for arm64 numbers and/or producing them for the
removal commit message sound like a plan? If so, I'll post soon(tm).

-- 
Mateusz Guzik <mjguzik gmail.com>