On Thu, 25 Jul 2024 11:50:59 +0100
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

Resending as this bounced due (I think) to an address typo.

> Hi Markus, Zhao Liu
>
> From the ARM server side this is something I want to see as well.
> So I can comment on why we care.
>
> > >> This series adds a way to configure caches.
> > >>
> > >> Structure of the configuration data: a list
> > >>
> > >>     [{"name": N, "topo": T}, ...]
> > >>
> > >> where N can be "l1d", "l1i", "l2", or "l3",
> > >> and T can be "invalid", "thread", "core", "module", "cluster",
> > >> "die", "socket", "book", "drawer", or "default".
> > >>
> > >> What's the use case? The commit messages don't tell.
> > >
> > > i386 has the default cache topology model: l1 per core/l2 per core/l3
> > > per die.
> > >
> > > Cache topology affects scheduler performance, e.g., the kernel's cluster
> > > scheduling.
> > >
> > > Of course I can hardcode some cache topology model in the specific cpu
> > > model that corresponds to the actual hardware, but for -cpu host/max,
> > > the default i386 cache topology model has no flexibility, and the
> > > host-cpu-cache option doesn't have enough fine-grained control over the
> > > cache topology.
> > >
> > > So I want to provide a way to allow users to create a more flexible
> > > cache topology, just like CPU topology.
> >
> > So the use case is exposing a configurable cache topology to the guest
> > in order to increase performance. Performance can increase when the
> > configured virtual topology is closer to the physical topology than a
> > default topology would be. This can be the case with CPU host or max.
> >
> > Correct?
>
> That is definitely why we want it on arm64, where this info fills in
> the topology we can't get from the CPU registers.
> (We should have patches on top of this to send out shortly.)
>
> As a side note, we also need this for MPAM emulation for TCG
> (and maybe eventually paravirtualized MPAM), as it is needed
> to build the right PPTT to describe the caches, which we then
> query to figure out the association of MPAM controls with particular
> caches.
>
> Size configuration is something we'll need down the line (presenting
> only part of an L3 may make sense if it's shared by multiple VMs
> or partitioned with MPAM) but that's a future question.
>
> >
> > >> Why does that use case make no sense without SMP?
> > >
> > > As in the example I mentioned, for Intel hybrid architecture, P cores
> > > have l2 per core and E cores have l2 per module. Then neither setting
> > > the l2 topology level to core nor to module can emulate the real case.
> > >
> > > Consider the even more extreme case of the Intel 14th-gen MTL CPU,
> > > where some E cores have L3 and some don't have L3 at all. Also, last
> > > time you and Daniel mentioned that in the future we could consider
> > > covering more cache properties such as cache size. But the L3 size can
> > > differ within the same system, as with AMD's X3D technology. So in
> > > general, configuring properties per @name in a list can't take into
> > > account the differences between heterogeneous caches with the same
> > > @name.
> > >
> > > Hope my poor English explains the problem well. :-)
> >
> > I think I understand why you want to configure caches. My question was
> > about the connection to SMP.
> >
> > Say we run a guest with a single core, no SMP. Could configuring caches
> > still be useful then?
>
> Probably not useful to configure topology (sizes are a separate question)
> - any sensible default should be fine.
>
> Jonathan
>
>
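For illustration (using only the cache names and topology levels Markus listed
above; the concrete command-line spelling is whatever the series ultimately
defines), a configuration matching the default i386 model of l1 per core,
l2 per core, and l3 per die would be written as:

    [{"name": "l1d", "topo": "core"},
     {"name": "l1i", "topo": "core"},
     {"name": "l2",  "topo": "core"},
     {"name": "l3",  "topo": "die"}]

and moving l2 out to the cluster level, say, would only mean changing that
one "topo" value.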