On Mon, Feb 3, 2025 at 9:45 AM Alireza Sanaee <alireza.sanaee@xxxxxxxxxx> wrote:
>
> On Mon, 3 Feb 2025 08:47:50 -0600
> Rob Herring <robh@xxxxxxxxxx> wrote:
>
> > On Mon, Feb 3, 2025 at 6:05 AM Alireza Sanaee
> > <alireza.sanaee@xxxxxxxxxx> wrote:
> > >
> > > For L1 cache to be shared between SMT threads, a register array
> > > must be used. This, however, is not straightforward if every node
> > > in the CPU map refers to a separate CPU node. Therefore, it is
> > > suggested to create a separate CPU node for every SMT thread. The
> > > L1 cache can be shared if an extra node represents it.
> >
> > I don't understand why a cpu-map is a problem for the SMT case?
> >
> > I don't think this change is necessary.
> >
> > Rob
>
> Hi Rob,
>
> I posted the following patch, which uses a reg array to represent
> threads, allowing threads to share resources within a CPU
> node using reg array and without requiring an extra l1-cache layer:
> https://lore.kernel.org/all/20250110161057.445-1-alireza.sanaee@xxxxxxxxxx/
>
> From Mark's remarks in the same patch, I learned that cpu-map object in
> the dt will need each thread to point to a CPU node entry in
> particular, (Documentation/devicetree/bindings/cpu/cpu-topology.txt). If
> I use the reg array, each thread in the CPU map will not be able to
> point to the corresponding CPU node as they are in the reg array.
>
> You might argue that CPU maps should also be able to be built based on
> the threads in the reg array, and I actually agree with that. Maybe
> that's something I should go about in that case.

The CPU binding in the spec predates cpu-topology.txt. Yes, that was
originally written for PowerPC, but there's really no good reason for
other architectures to deviate. L1 caches are not the only thing
shared. There are clocks, power-domains, OPPs, etc. The CPU node parsing
functions (e.g. of_get_cpu_node()) are also already designed for
threads to share a CPU node. IMO, we should follow the spec.

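As a rough sketch of what following the spec could look like (the
compatible string, the phandle labels and the cache sizes below are
placeholders, not anything taken from the patch, and 2 address cells
are assumed), a single CPU node with two threads lists one 'reg' entry
per thread and carries the shared resources once:

cpus {
  #address-cells = <2>;
  #size-cells = <0>;

  cpu0: cpu@0 {
    device_type = "cpu";
    compatible = "vendor,smt-cpu";           /* placeholder */
    reg = <0x0 0x0>, <0x0 0x1>;              /* one entry per thread */
    d-cache-size = <0x8000>;                 /* placeholder L1 sizes */
    i-cache-size = <0x8000>;
    next-level-cache = <&l2>;                /* placeholder label */
    clocks = <&cpu_clk>;                     /* placeholder label */
    power-domains = <&cpu_pd>;               /* placeholder label */
    operating-points-v2 = <&cpu_opp_table>;  /* placeholder label */
  };
};

Both threads then pick up the same L1 cache properties, clocks, power
domain and OPP table from the one node.
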
For cpu-map, there are 2 choices if there is 1 CPU node for all shared
threads:

- Don't describe threads in the map and use 'reg' to get any thread info.
- Make shared threads in the map point to the same CPU node.

The latter option could be:

core0 {
  thread0 {
    cpu = <&cpu0>;
  };
  thread1 {
    cpu = <&cpu0>;
  };
};

Or:

core0 {
  thread0 {
    cpu = <&cpu0 0 0>;
  };
  thread1 {
    cpu = <&cpu0 0 1>;
  };
};

Where "0 0" and "0 1" match the 'reg' address of the thread.

Rob