On Mon, Nov 07, 2011 at 11:35:27AM -0700, Eric Blake wrote: > On 11/06/2011 06:57 AM, Bharata B Rao wrote: > >XML definitions for guest NUMA and parsing routines. > > > >From: Bharata B Rao<bharata@xxxxxxxxxxxxxxxxxx> > > > >This patch adds XML definitions for guest NUMA specification and contains > >routines to parse the same. The guest NUMA specification looks like this: > > > ><cpu> > > ... > > <topology sockets='2' cores='4' threads='2'/> > > <numa> > > <cell cpus='0-7' mems='512000'/> > > <cell cpus='8-15' mems='512000'/> > > </numa> > > ... > ></cpu> > > > >Signed-off-by: Bharata B Rao<bharata@xxxxxxxxxxxxxxxxxx> > >--- > > >+<p> > >+ Guest NUMA topology can be specifed using<code>numa</code> element. > >+<span class="since">Since X.X.X</span> > > Let's just put 0.9.8 here. It's easier at feature freeze time to > grep and replace a concrete 0.9.8 into a different numbering scheme > (0.10.0, 1.0.0, ?) if we decide on something different than 0.9.8, > than it is to remember to also search for X.X.X. Ok, changed to 0.9.8 in v3. > > >+</p> > >+ > >+<pre> > >+ ... > >+<cpu> > >+ ... > >+<numa> > >+<cell cpus='0-3' mems='512000'/> > >+<cell cpus='4-7' mems='512000'/> > > I understand 'cpus' (a valid word meaning multiple cpu units), but > 'mems' seems odd; I think it would be better naming this attribute > 'memory' to match our <memory> element at the top level. Just > because qemu's command line names the option mems= doesn't mean we > should be stuck making our XML inconsistent. Changed to memory. > > >+</numa> > >+ ... > >+</cpu> > >+ ...</pre> > >+ > >+<p> > >+ Each<code>cell</code> element specifies a NUMA cell or a NUMA node. > >+<code>cpus</code> specifies the CPU or range of CPUs that are part of > >+ the node.<code>mems</code> specifies the node memory in kilobytes > >+ (i.e. blocks of 1024 bytes). Each cell or node is assigned cellid > >+ or nodeid in the increasing order starting from 0. > > I agree with doing things in 1024-byte blocks [1], since <memory> > and <currentMemory> are also in that unit. > > [1] 1024-byte blocks is technically kibibytes, not kilobytes; but > you're copying from existing text, so at least we're consistent, not > to mention fitting right in with the wikipedia complaint that KiB > has had slow adoption by the computer industry: > https://secure.wikimedia.org/wikipedia/en/wiki/Kibibyte :) > > >+</p> > >+ > >+<p> > >+ This guest NUMA specification translates to<code>-numa</code> command > >+ line option for QEMU/KVM. For the above example, the following QEMU > >+ command line option is generated: > >+<code>-numa node,nodeid=0,cpus=0-3,mems=512000 -numa node,nodeid=1,cpus=4-7,mems=512000</code> > > This paragraph is not necessary. We don't need to give one > hypervisor-specific example of how the XML is translated; it is > sufficient to simply document the XML semantics in a way that can be > implemented by any number of hypervisors. Ok, removed this paragraph. > > >+ > >+<define name="numaCell"> > >+<element name="cell"> > >+<attribute name="cpus"> > >+<ref name="Cellcpus"/> > > Typically, ref names start with a lower case letter. Ok, changed. > > >+</attribute> > >+<attribute name="mems"> > >+<ref name="Cellmems"/> > >+</attribute> > > Is it possible for these attributes to be optional? That is, on the > qemu line, can I specify cpus= but not mems=, or mems= but not > cpus=? If so, then the attributes need to be optional in the schema, > and the code behave with sane defaults when one of the two > attributes is not present. Actually qemu allows both cpus and mem to be optional. But if we allow that, we will have specifications like this: <numa> <cell /> <cell /> </numa> That looks ugly. Either both of them should allowed to be optional or none of them, I am going for the latter specification. IOW lets make both of them mandatory in a numa cell. > > >@@ -2745,4 +2767,14 @@ > > <param name="pattern">[a-zA-Z0-9_\.:]+</param> > > </data> > > </define> > >+<define name="Cellcpus"> > >+<data type="string"> > >+<param name="pattern">([0-9]+(-[0-9]+)?|\^[0-9]+)(,([0-9]+(-[0-9]+)?|\^[0-9]+))*</param> > > This looks like a repeat of <define name="cpuset">; if so, let's > reuse that define instead of making a new one (and if not, why are > we introducing yet another syntax?). Fixed. > > >+</data> > >+</define> > >+<define name="Cellmems"> > >+<data type="unsignedInt"> > >+<param name="pattern">[0-9]+</param> > > Likewise, this looks like a repeat of <define name="memoryKB">, so > lets reuse that. Done. > > >@@ -109,6 +114,19 @@ no_memory: > > return NULL; > > } > > > >+static int > >+virCPUDefNumaCPUs(virCPUDefPtr def) > >+{ > >+ int i, j, ncpus = 0; > >+ > >+ for (i = 0; i< def->ncells; i++) { > >+ for (j = 0; j< VIR_DOMAIN_CPUMASK_LEN; j++) { > >+ if (def->cells[i].cpumask[j]) > >+ ncpus++; > >+ } > >+ } > > Can this loop be made any faster by using count_one_bits? Yes provided we start storing CPUs in bitmasks rather than char arrays. But I saw other parts of the code (like cpuset) storing CPUs in array and hence used the same for numa too. In any case I got rid of this since we can arrive at this information from other means. > > >+ return ncpus; > >+} > > > > virCPUDefPtr > > virCPUDefParseXML(const xmlNodePtr node, > >@@ -289,6 +307,50 @@ virCPUDefParseXML(const xmlNodePtr node, > > def->features[i].policy = policy; > > } > > > >+ if (virXPathNode("./numa[1]", ctxt)) { > >+ VIR_FREE(nodes); > >+ n = virXPathNodeSet("./numa[1]/cell", ctxt,&nodes); > >+ if (n< 0 || n == 0) { > > This looks a bit funny, compared to if (n <= 0). Yes. I first accounted for the error case and later added a check for zero cells in <numa>. Fixed this. > > >+ for (i = 0 ; i< n ; i++) { > >+ char *cpus; > >+ int cpumasklen = VIR_DOMAIN_CPUMASK_LEN; > >+ unsigned long ul; > >+ int ret; > >+ > >+ def->cells[i].cellid = i; > >+ cpus = virXMLPropString(nodes[i], "cpus"); > >+ > >+ if (VIR_ALLOC_N(def->cells[i].cpumask, cpumasklen)< 0) > >+ goto no_memory; > >+ > >+ if (virDomainCpuSetParse((const char **)&cpus, > >+ 0, def->cells[i].cpumask, > >+ cpumasklen)< 0) > > Does this behave properly if the cpus=... attribute was missing? This was buggy. Fixed now. > > >+ goto error; > >+ > >+ ret = virXPathULong("string(./numa[1]/cell/@mems)", > >+ ctxt,&ul); > >+ if (ret< 0) { > >+ virCPUReportError(VIR_ERR_INTERNAL_ERROR, > >+ "%s", _("Missing 'mems' attribute in NUMA topology")); > >+ goto error; > > Especially since you checked for a missing mems= counterpart? And > back to my earlier question of whether both attributes are really > mandatory, or whether just one in isolation can make sense. As explained above, I made both of them mandatory now. > > >+++ b/src/conf/cpu_conf.h > >@@ -67,6 +67,14 @@ struct _virCPUFeatureDef { > > int policy; /* enum virCPUFeaturePolicy */ > > }; > > > >+typedef struct _virCellDef virCellDef; > >+typedef virCellDef *virCellDefPtr; > >+struct _virCellDef { > >+ int cellid; > >+ char *cpumask; /* CPUs that are part of this node */ > >+ unsigned int mem; /* Node memory */ > > I'd write this as /* Node memory in kB */, to make it clear what > scale is in use. Done. > > I saw the code for parsing the XML, but what about the code for > generating the xml during 'virsh dumpxml' (aka virDomainDefFormat)? > The patch is incomplete without that. Thanks for pointing this out. Added code for this now. Regards, Bharata. -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list