Re: Problem of host CPU topology parsing

Jiri Denemark <jdenemar@xxxxxxxxxx> · Fri, 11 May 2012 12:07:54 +0200

On Fri, May 11, 2012 at 10:53:12 +0100, Daniel P. Berrange wrote:
> On Fri, May 11, 2012 at 05:42:34PM +0800, Osier Yang wrote:
> > On 2012年05月11日 17:01, Jiri Denemark wrote:
> > >On Fri, May 11, 2012 at 10:47:06 +0200, Michal Privoznik wrote:
> > >>On 11.05.2012 10:40, Osier Yang wrote:
> > >>>     /* nodeinfo->sockets is supposed to be a number of sockets per NUMA
> > >>>node,
> > >>>      * however if NUMA nodes are not composed of whole sockets, we just lie
> > >>>      * about the number of NUMA nodes and force apps to check
> > >>>capabilities XML
> > >>>      * for the actual NUMA topology.
> > >>>      */
> > >>>     if (nodeinfo->sockets % nodeinfo->nodes == 0)
> > >>>         nodeinfo->sockets /= nodeinfo->nodes;
> > >>>     else
> > >>>         nodeinfo->nodes = 1;
> > >>>
> > >>>Jirka said this was for a fix, but I don't quite understand it,
> > >>>what does the "nodeinfo.nodes" mean actually? Shouldn't it
> > >>>be 8 (for the 48 CPUs machine) instead? But then we will be
> > >>>wrong again with using VIR_NODEINFO_MAXCPUS.
> > >>
> > >>Why do you think it will be wrong? My understanding is that
> > >>VIR_NODEINFO_MAXCPUS just tell the max number of possible cpus not the
> > >>actual. So if it's over 48 we are safe.
> > >
> > >Not really, the macro should count exactly the number of CPUs available to
> > >host, otherwise lots of other issues (incl. backward compatibility) appear. It
> > >is just a badly named macro that should never exist but we can't do anything
> > >with it since it is our public API.
> > >
> > >>Btw: the code above seems like a hack to me.
> > >
> > >Yes, it is a hack but it's unfortunately required because we can't change the
> > >macro.
> > >
> > >Anyway, I agree with Daniel that the bug most likely lies somewhere in the
> > >code that populates nodeinfo structure.
> > >
> > >Jirka
> > 
> > In /proc/cpuinfo:
> > 
> > <snip>
> > cpu cores       : 12
> > </snip>
> > 
> > However, there are only 6 core IDs, as showed in
> > http://fpaste.org/mtoA/. And we parse the core_id
> > file of each CPU as:
> > 
> >         core = parse_core(cpu);
> >         if (!CPU_ISSET(core, &core_mask)) {
> >             CPU_SET(core, &core_mask);
> >             nodeinfo->cores++;
> >         }
> > 
> > and thus get only 6 cores. Don't known how 12 in /proc/cpuinfo
> > is figured out. But could it be a clue?
> 
> Ahhh.  The AMD 12 "core" CPUs are in fact a pair of 6 core CPUs
> with 2 NUMA nodes in the CPU itself.

Oh, so the problem is that two 6-core CPUs share the same socket and thus have
the same physical ID. So it's either 8 6-core CPUs or 4 12-core CPUs. Not
sure which one is better to present. The first one is the real thing and the
second one is how AMD presents the reality :-) Anyway, we should do something
with

        /* Parse core */
        core = parse_core(cpu);
        if (!CPU_ISSET(core, &core_mask)) {
            CPU_SET(core, &core_mask);
            nodeinfo->cores++;
        }

        /* Parse socket */
        sock = parse_socket(cpu);
        if (!CPU_ISSET(sock, &socket_mask)) {
            CPU_SET(sock, &socket_mask);
            nodeinfo->sockets++;
        }

which just ignores duplicate physical/core IDs. I feel like this was added
there for some reason, though...

Jirka

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list