Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Philippe,

On Thu, Sep 14, 2023 at 09:38:46AM +0200, Philippe Mathieu-Daudé wrote:
> Date: Thu, 14 Sep 2023 09:38:46 +0200
> From: Philippe Mathieu-Daudé <philmd@xxxxxxxxxx>
> Subject: Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to
>  CPUX86State
> 
> On 14/9/23 09:21, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@xxxxxxxxx>
> > 
> > smp command has the "clusters" parameter but x86 hasn't supported that
> > level. "cluster" is a CPU topology level concept above cores, in which
> > the cores may share some resources (L2 cache or some others like L3
> > cache tags, depending on the Archs) [1][2]. For x86, the resource shared
> > by cores at the cluster level is mainly the L2 cache.
> > 
> > However, using cluster to define x86's L2 cache topology will cause the
> > compatibility problem:
> > 
> > Currently, x86 defaults that the L2 cache is shared in one core, which
> > actually implies a default setting "cores per L2 cache is 1" and
> > therefore implicitly defaults to having as many L2 caches as cores.
> > 
> > For example (i386 PC machine):
> > -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> > 
> > Considering the topology of the L2 cache, this (*) implicitly means "1
> > core per L2 cache" and "2 L2 caches per die".
> > 
> > If we use cluster to configure L2 cache topology with the new default
> > setting "clusters per L2 cache is 1", the above semantics will change
> > to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> > cores per L2 cache".
> > 
> > So the same command (*) will cause changes in the L2 cache topology,
> > further affecting the performance of the virtual machine.
> > 
> > Therefore, x86 should only treat cluster as a cpu topology level and
> > avoid using it to change L2 cache by default for compatibility.
> > 
> > "cluster" in smp is the CPU topology level which is between "core" and
> > die.
> > 
> > For x86, the "cluster" in smp is corresponding to the module level [2],
> > which is above the core level. So use the "module" other than "cluster"
> > in i386 code.
> > 
> > And please note that x86 already has a cpu topology level also named
> > "cluster" [3], this level is at the upper level of the package. Here,
> > the cluster in x86 cpu topology is completely different from the
> > "clusters" as the smp parameter. After the module level is introduced,
> > the cluster as the smp parameter will actually refer to the module level
> > of x86.
> > 
> > [1]: 864c3b5c32f0 ("hw/core/machine: Introduce CPU cluster topology support")
> > [2]: Yanan's comment about "cluster",
> >       https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg04051.html
> > [3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> > 
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@xxxxxxxxx>
> > Co-developed-by: Zhao Liu <zhao1.liu@xxxxxxxxx>
> > Signed-off-by: Zhao Liu <zhao1.liu@xxxxxxxxx>
> > Acked-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > ---
> > Changes since v1:
> >   * The background of the introduction of the "cluster" parameter and its
> >     exact meaning were revised according to Yanan's explanation. (Yanan)
> > ---
> >   hw/i386/x86.c     | 1 +
> >   target/i386/cpu.c | 1 +
> >   target/i386/cpu.h | 5 +++++
> >   3 files changed, 7 insertions(+)
> 
> 
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 470257b92240..556e80f29764 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1903,6 +1903,11 @@ typedef struct CPUArchState {
> >       /* Number of dies within this CPU package. */
> >       unsigned nr_dies;
> > +    /*
> > +     * Number of modules within this CPU package.
> > +     * Module level in x86 cpu topology is corresponding to smp.clusters.
> > +     */
> > +    unsigned nr_modules;
> >   } CPUX86State;
> 
> It would be really useful to have an ASCII art comment showing
> the architecture topology.

Good idea, I could consider how to show that.

> Also for clarity the topo fields from
> CPU[Arch]State could be moved into a 'topo' sub structure, or even
> clearer would be to re-use the X86CPUTopoIDs structure?

Yeah, I also have the plan to do further cleanup about these topology
structures [1]. X86CPUTopoInfo is not suitable for hybrid topology case,
(hybrid case needs another structure to store the max number elements
for each level), so I will try to move that X86CPUTopoInfo into
CPU[Arch]State.

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg01032.html

Thanks,
Zhao



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux