On Fri, 10 Jan 2025 17:25:50 +0000
Mark Rutland <mark.rutland@xxxxxxx> wrote:

Hi Mark,

Just resending, but without the screenshot mistakenly attached to the
other email. Sorry about that.

> On Fri, Jan 10, 2025 at 05:02:11PM +0000, Alireza Sanaee wrote:
> > On Fri, 10 Jan 2025 16:23:00 +0000
> > Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > Hi Mark,
> >
> > Thanks for prompt feedback.
> >
> > Please look inline.
> >
> > > On Fri, Jan 10, 2025 at 04:10:57PM +0000, Alireza Sanaee wrote:
> > > > Update `of_parse_and_init_cpus` to parse reg property of CPU
> > > > node as an array based as per spec for SMT threads.
> > > >
> > > > Spec v0.4 Section 3.8.1:
> > >
> > > Which spec, and why do we care?
> >
> > For the spec, this is what I looked
> > into https://github.com/devicetree-org/devicetree-specification/releases/download/v0.4/devicetree-specification-v0.4.pdf
> > Section 3.8.1
> >
> > Sorry I didn't put the link in there.
>
> Ok, so that's "The devicetree specification v0.4 from ${URL}", rather
> than "Spec v0.4". :)

Sure, I will be more precise in my future correspondence.

>
> > One limitation with the existing approach is that it is not really
> > possible to describe shared caches for SMT cores as they will be
> > seen as separate CPU cores in the device tree. Is there anyway to
> > do so?
>
> Can't the existing cache bindings handle that? e.g. give both threads
> a next-level-cache pointing to the shared L1?

Unfortunately, I tested this recently. There is some legwork needed to
even enable that, and it does not work right now.

>
> > More discussion over sharing caches for threads
> > here https://lore.kernel.org/kvm/20241219083237.265419-1-zhao1.liu@xxxxxxxxx/
>
> In that thread Rob refers to earlier discussions, so I don't think
> that thread alone has enough context.

https://lore.kernel.org/linux-devicetree/CAL_JsqLGEvGBQ0W_B6+5cME1UEhuKXadBB-6=GoN1tmavw9K_w@xxxxxxxxxxxxxx/

This was the earlier discussion, where Rob pointed me towards
investigating this approach (this patch).

>
> > > > The value of reg is a <prop-encoded-**array**> that defines a
> > > > unique CPU/thread id for the CPU/threads represented by the CPU
> > > > node. **If a CPU supports more than one thread (i.e. multiple
> > > > streams of execution) the reg property is an array with 1
> > > > element per thread**. The address-cells on the /cpus node
> > > > specifies how many cells each element of the array takes.
> > > > Software can determine the number of threads by dividing the
> > > > size of reg by the parent node's address-cells.
> > >
> > > We already have systems where each thread gets a unique CPU node
> > > under /cpus, so we can't rely on this to determine the topology.
> >
> > I assume we can generate unique values even in reg array, but
> > probably makes things more complicated.
>
> The other bindings use phandles to refer to threads, and phandles
> point to nodes in the dt, so it's necessary for threads to be given
> separate nodes.
>
> Note that the CPU topology bindings use that to describe threads, see
>
>   Documentation/devicetree/bindings/cpu/cpu-topology.txt

Noted. Makes sense.

>
> > > Further, there are bindings which rely on being able to address
> > > each CPU/thread with a unique phandle (e.g. for affinity of PMU
> > > interrupts), which this would break.
>
> > > Regardless, as above I do not think this is a good idea. While it
> > > allows the DT to be written in a marginally simpler way, it makes
> > > things more complicated for the kernel and is incompatible with
> > > bindings that we already support.
> > >
> > > If anything "the spec" should be relaxed here.
> >
> > Hi Rob,
> >
> > If this approach is too disruptive, then shall we fallback to the
> > approach where go share L1 at next-level-cache entry?
>
> Ah, was that previously discussed, and were there any concerns against
> that approach?
>
> To be clear, my main concern here is that threads remain represented
> as distinct nodes under /cpus; I'm not wedded to the precise solution
> for representing shared caches.

This was basically what came to mind as a non-invasive preliminary
solution. That said, there have been no discussions yet over the
downsides or advantages of having a separate layer for the L1 cache.

But if it is something reasonable, I can look into it.

>
> Mark.
>

Thanks,
Alireza
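
For illustration only (this is not code from the patch under
discussion): the thread-count rule quoted above from the devicetree
specification v0.4, Section 3.8.1, can be sketched in plain C. The
helper name `dt_reg_thread_count` and the constant `DT_CELL_BYTES` are
hypothetical and are not kernel APIs. The sketch assumes the size of reg
is given in bytes and that one devicetree cell is 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of one devicetree cell (a 32-bit value). */
#define DT_CELL_BYTES 4u

/*
 * Hypothetical helper, not a kernel API.
 *
 * Returns the number of threads described by a CPU node's "reg"
 * property: the property length divided by the size of one entry,
 * where one entry takes #address-cells cells of the parent /cpus node.
 * Returns -1 if the inputs cannot describe a whole number of entries.
 */
static int dt_reg_thread_count(size_t reg_len_bytes, uint32_t address_cells)
{
	size_t entry_bytes;

	if (address_cells == 0)
		return -1;

	entry_bytes = (size_t)address_cells * DT_CELL_BYTES;

	/* An empty or partially filled reg array is treated as malformed. */
	if (reg_len_bytes == 0 || reg_len_bytes % entry_bytes != 0)
		return -1;

	return (int)(reg_len_bytes / entry_bytes);
}
```

With #address-cells = 1, a reg property of 8 bytes would describe two
threads under this rule. As Mark points out, existing systems give each
thread its own node under /cpus, so a count of one from this rule does
not by itself say anything about the SMT topology.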