On Thu, 25 Jul 2024 11:50:59 +0100
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:

Resending as this bounced due (I think) to an address typo.

> Hi Markus, Zhao Liu
>
> From the ARM server side this is something I want to see as well.
> So I can comment on why we care.
>
> > >> This series adds a way to configure caches.
> > >>
> > >> Structure of the configuration data: a list
> > >>
> > >>     [{"name": N, "topo": T}, ...]
> > >>
> > >> where N can be "l1d", "l1i", "l2", or "l3",
> > >> and T can be "invalid", "thread", "core", "module", "cluster",
> > >> "die", "socket", "book", "drawer", or "default".
> > >>
> > >> What's the use case? The commit messages don't tell.
> > >
> > > i386 has the default cache topology model: l1 per core/l2 per core/l3
> > > per die.
> > >
> > > Cache topology affects scheduler performance, e.g., the kernel's cluster
> > > scheduling.
> > >
> > > Of course I can hardcode some cache topology model in the specific cpu
> > > model that corresponds to the actual hardware, but for -cpu host/max,
> > > the default i386 cache topology model has no flexibility, and the
> > > host-cpu-cache option doesn't have enough fine-grained control over the
> > > cache topology.
> > >
> > > So I want to provide a way to allow users to create a more flexible
> > > cache topology, just like CPU topology.
> >
> > So the use case is exposing a configurable cache topology to the guest
> > in order to increase performance. Performance can increase when the
> > configured virtual topology is closer to the physical topology than a
> > default topology would be. This can be the case with CPU host or max.
> >
> > Correct?
>
> That is definitely why we want it on arm64, where this info fills in
> the topology we can't get from the CPU registers.
> (We should have patches on top of this to send out shortly.)
>
> As a side note, we also need this for MPAM emulation for TCG
> (and maybe eventually paravirtualized MPAM), as it is needed
> to build the right PPTT to describe the caches, which we then
> query to figure out the association of MPAM controls with particular
> caches.
>
> Size configuration is something we'll need down the line (presenting
> only part of an L3 may make sense if it's shared by multiple VMs
> or partitioned with MPAM) but that's a future question.
>
> >
> > >> Why does that use case make no sense without SMP?
> > >
> > > As in the example I mentioned, for Intel hybrid architecture, P cores
> > > have l2 per core and E cores have l2 per module. Then neither setting
> > > the l2 topology level to core nor to module can emulate the real case.
> > >
> > > Consider the even more extreme case of the Intel 14th-gen MTL CPU,
> > > where some E cores have L3 and some don't have L3 at all. Also, last
> > > time you and Daniel mentioned that in the future we could consider
> > > covering more cache properties such as cache size. But the L3 size can
> > > differ within the same system, as with AMD's X3D technology. So in
> > > general, configuring properties per @name in a list can't take into
> > > account the differences between heterogeneous caches with the same
> > > @name.
> > >
> > > Hope my poor English explains the problem well. :-)
> >
> > I think I understand why you want to configure caches. My question was
> > about the connection to SMP.
> >
> > Say we run a guest with a single core, no SMP. Could configuring caches
> > still be useful then?
>
> Probably not useful to configure topology (sizes are a separate question)
> - any sensible default should be fine.
>
> Jonathan
>
>
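For illustration (using only the cache names and topology levels Markus listed
above; the concrete command-line spelling is whatever the series ultimately
defines), a configuration matching the default i386 model of l1 per core,
l2 per core, and l3 per die would be written as:

    [{"name": "l1d", "topo": "core"},
     {"name": "l1i", "topo": "core"},
     {"name": "l2",  "topo": "core"},
     {"name": "l3",  "topo": "die"}]

and moving l2 out to the cluster level, say, would only mean changing that
one "topo" value.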