On 10/8/2018 3:26 AM, James Morse wrote:
> Hi Jeffrey,
>
> On 05/10/2018 17:39, Jeffrey Hugo wrote:
>> On 10/5/2018 9:54 AM, James Morse wrote:
>>> The problem is generating these numbers if only some of the CPUs are
>>> online, or if the acpi tables are generated by firmware at power-on
>>> and have a different layout every time. We don't even want to rely on
>>> linux's cpu numbering.
>>>
>>> The suggestion here is to use the smallest MPIDR, as that's a hardware
>>> property that won't change even if the tables are generated
>>> differently every boot.
>> I can't think of a reason why affinity level 0 would ever change for a

> affinity level of what? The caches? Sure, that should be impossible to
> change.

>> particular thread or core (SMT vs non-SMT), however none of the other
>> affinity levels have a well defined meaning (implementation dependent),
>> and could very well change boot to boot.
> Ah, you mean the level numbers. Yes, these are (quasi) discovered from
> the table, so can't be relied on.
>
> If you insert a new level then this would shuffle the user-visible
> indexes and levels. I would argue this is no longer the same hardware.
Well, it's common in my experience for x86 server hardware to allow the
user to enable/disable SMT. This seems to allow the user to configure
the hardware to better suit their workloads.

Applying this to ARM, you might have a system where it starts in non-SMT
mode:

Aff3: 0
Aff2: 0
Aff1: 0
Aff0: X

however, after changing the setting to enable SMT and a reboot:

Aff3: 0
Aff2: 0
Aff1: Y
Aff0: X

If you base the cache ids on MPIDR, then they would likely change between
SMT and non-SMT configurations, for example. However, you seem to
consider that as different hardware, and therefore if the cache ids
change, so be it.
>>> Doing this may already break resctrl as the 'L2' and 'L3' numbers are
>>> part of the ABI. The ids generated would still be unique for a level.
>> I would strongly avoid using MPIDR, particularly for the usecase you've
>> described.
> Is there an alternative? It needs to be a hardware property to insulate
> us from the tables being re-generated.
>
> I agree the MPIDR numbers are likely to be ugly, (I don't really
> care...). The u32 id being full from day 1 is more of a problem.
I agree, it needs to be a hardware property, and for the purposes of
uniquely identifying a core, I'm not seeing an alternative to MPIDR. I
think that MPIDR should be ok so long as we don't attempt to derive any
additional information from it, other than that it's an opaque value
which uniquely identifies a core (or thread).

Since you have a u32 cache id, and MPIDR is a u64, I'm not really
thrilled with trying to jam MPIDR into the cache id. Also, as you point
out, even though it's not guaranteed, SW is going to expect the cache ids
are numbered 0, 1, 2, etc. I've seen issues with virtio where MPIDR
values were exposed via sysfs as unique CPU identifiers, and it ended up
preventing VMs from initializing on the system.
I suspect identifying the list of MPIDRs in the system, sorting them,
and then assigning cache ids based on the sorted list position would be
the best solution, ignoring that implementing it in PPTT parsing is
probably going to be painful.
> Assuming two clusters in your example above, it would look like:
>
> | CPU0/1 (cluster 0) L2 - 0x0
> | CPU2/3 (cluster 1) L2 - 0x100
> |                    L3 - 0x0
> Thanks for the clarification. I think I've got enough to wrap my head
> around this. Let me think on it a bit to see if I can come up with a
> suggestion (we can debate how good it is).
Sounds good.
Thanks!
--
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.