On 11/6/2018 4:04 AM, John Ferlan
wrote:
On 10/22/18 4:01 AM, Wang Huaqiang wrote:Adding the interface in qemu to report CMT statistic information through command 'virsh domstats --cpu-total'. Below is a typical output: # virsh domstats 1 --cpu-total Domain: 'ubuntu16.04-base' ... cpu.cache.monitor.count=2 cpu.cache.monitor.0.name=vcpus_1 cpu.cache.monitor.0.vcpus=1 cpu.cache.monitor.0.bank.count=2 cpu.cache.monitor.0.bank.0.id=0 cpu.cache.monitor.0.bank.0.bytes=4505600 cpu.cache.monitor.0.bank.1.id=1 cpu.cache.monitor.0.bank.1.bytes=5586944 cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6 cpu.cache.monitor.1.bank.count=2 cpu.cache.monitor.1.bank.0.id=0 cpu.cache.monitor.1.bank.0.bytes=17571840 cpu.cache.monitor.1.bank.1.id=1 cpu.cache.monitor.1.bank.1.bytes=29106176 Signed-off-by: Wang Huaqiang <huaqiang.wang@xxxxxxxxx> --- src/libvirt-domain.c | 9 ++ src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 229 +++++++++++++++++++++++++++++++++++++++++++++++ src/util/virresctrl.c | 130 +++++++++++++++++++++++++++ src/util/virresctrl.h | 12 +++ 5 files changed, 381 insertions(+)I have a feeling I'll be asking for this to be split up a bit, but let's see... There's *util, *qemu, and *API code in the same patch. diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 7690339..4895f9f 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn, * "cpu.user" - user cpu time spent in nanoseconds as unsigned long long. * "cpu.system" - system cpu time spent in nanoseconds as unsigned long * long. + * "cpu.cache.monitor.count" - tocal cache monitoring groups + * "cpu.cache.monitor.M.name" - name of cache monitoring group 'M' + * "cpu.cache.monitor.M.vcpus" - vcpus for cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.count" - total bank number of cache monitoring + * group 'M' + * "cpu.cache.monitor.M.bank.N.id" - OS assigned cache bank id for cache + * 'N' in cache monitoring group 'M' + * "cpu.cache.monitor.M.bank.N.bytes" - monitor's cache occupancy of cache + * bank 'N' in cache monitoring group 'M' * * VIR_DOMAIN_STATS_BALLOON: * Return memory balloon device information. diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 91801ed..4551767 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2683,6 +2683,7 @@ virResctrlInfoNew; virResctrlMonitorAddPID; virResctrlMonitorCreate; virResctrlMonitorDeterminePath; +virResctrlMonitorGetCacheOccupancy; virResctrlMonitorGetID; virResctrlMonitorIsRunning; virResctrlMonitorNew; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 12a5f8f..9828118 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -102,6 +102,7 @@ #include "virnuma.h" #include "dirname.h" #include "netdev_bandwidth_conf.h" +#include "c-ctype.h"Ahh.. this one's a red flag... Says to me that something should move to a util function... Got. Header file will be removed (actually I planned to remove all code using it.) #define VIR_FROM_THIS VIR_FROM_QEMU @@ -19698,6 +19699,230 @@ typedef enum { #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB) +/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid + * outputs and represent different vcpu set. But it is not easy to + * differentiate these two vcpu set formats at first glance. + * + * This function could be used to clear this ambiguity, it substitutes all '-' + * with ',' while generates semantically correct vcpu set. + * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */ +static char * +qemuDomainVcpuFormatHelper(const char *vcpus) +{ + size_t i = 0; + int last = 0; + int start = 0; + char * tmp = NULL; + bool firstnum = true; + const char *cur = vcpus; + virBuffer buf = VIR_BUFFER_INITIALIZER; + char *ret = NULL; + + if (virStringIsEmpty(cur)) + return NULL; + + while (*cur != '\0') { + if (!c_isdigit(*cur))Explains the #include, but in the long run, I don't think this method is worth the effort since all you're doing is printing the format in the output. Is there other libvirt generated output for cpu stats that does a similar conversion? If so, share code, if not drop this. No need to rearrange the range someone else entered and we've succesfully parsed in some manner. I think the output should look like the XML output unless there's some compelling reason to change it which I don't see. >From above the:cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4,5,6would then be:cpu.cache.monitor.1.name=vcpus_4-6 cpu.cache.monitor.1.vcpus=4-6right? Yes, this heavy function is just re-formatting the output of vcpus. I also think it not worth to do this kind of reformat. Let's remove it. I don't even want to think about someone who does "7,4-6"...+ goto cleanup; + + if (virStrToLong_i(cur, &tmp, 10, &start) < 0) + goto cleanup; + if (start < 0) + goto cleanup; + + cur = tmp; + + virSkipSpaces(&cur); + + if (*cur == ',' || *cur == 0) { + if (!firstnum) + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%d", start); + firstnum = false; + } else if (*cur == '-') { + cur++; + virSkipSpaces(&cur); + + if (virStrToLong_i(cur, &tmp, 10, &last) < 0) + goto cleanup; + + if (last < start) + goto cleanup; + cur = tmp; + + for (i = start; i <= last; i++) { + if (!firstnum) + + virBufferAddChar(&buf, ','); + virBufferAsprintf(&buf, "%ld", i); + firstnum = 0; + } + + virSkipSpaces(&cur); + } + + if (*cur == ',') { + cur++; + virSkipSpaces(&cur); + } else if (*cur == 0) { + break; + } else { + goto cleanup; + } + } + + ret = virBufferContentAndReset(&buf); + cleanup: + virBufferFreeAndReset(&buf); + return ret; +} + + +static int +qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, + virDomainObjPtr dom, + virDomainStatsRecordPtr record, + int *maxparams, + unsigned int privflags ATTRIBUTE_UNUSED, + virResctrlMonitorType restag)Why bother passing ATTRIBUTE_UNUSED ? I just copy-pasted the function header from some other function. Will be removed since it is an internal function. +{ + char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; + virDomainResctrlMonDefPtr domresmon = NULL; + virResctrlMonitorStatsPtr stats = NULL; + size_t nstats = 0;since these are local to the: + for (j = 0; j < resctrl->nmonitors; j++) { loop and there's local's in that for loop, maybe these two should move there too... If the lines with virResctrlMonitorIsRunning could be removed, then this loop is not necessary any more. virResctrlMonitorIsRunning is validating monitor through verifying PID lists, but it seems there is no need to do such kind of validation and the PIDs in *tasks file will not change without control: -. The PID in *tasks file will be removed by kernel if vcpu process is terminated, but the vcpu process will not exit except performing vcpu hot-plug. -. In process of libvirt re-connection, VM will not make any changes in terms of resctrl groups and vcpus, so the content of *tasks will not change and the layout of resctrl groups will not change. -. Any changes out of libvirt are not allowed, for example, manually creating or deleting any resctrl groups by bash commands are not allowed. I decided to remove the lines invoking virResctrlMonitorIsRunning, and the loop will be removed. Actually I will split this function into three functions: qemuDomainGetCpuResMonitorData qemuDomainGetStatsCpuResMonitorPerTag and qemuDomainGetStatsCPUResMonitor qemuDomainGetCpuResMonitorData will fetch data such as the monitor group ID, vcpus, and cache information for all monitors, any error happened in this function will reported to user through virReportError. So the error will be reported to user before dumping the final result. qemuDomainGetStatsCpuResMonitorPerTag is responsible for showing the data retrieved by a successful call of qemuDomainGetCpuResMonitorData. In this function it does not dump any error message created by virReportError which will interrupt the showing of cache monitor information. qemuDomainGetStatsCPUResMonitor is the interface called by qemuDomainGetStatsCPU. we could invoke qemuDomainGetStatsCpuResMonitorPerTag with tag VIR_RESCTRL_MONITOR_TYPE_CACHE for cache monitor. memBW monitor result will be processed by invoking this function through specify tag VIR_RESCTRL_MONITOR_TYPE_MEMBW. + virDomainResctrlDefPtr resctrl = NULL; + unsigned int nmonitors = 0; + unsigned int imonitor = 0; + const char *restype = NULL; + char *rawvcpus = NULL; + char *vcpus = NULL; + size_t i = 0; + size_t j = 0; + int ret = -1; + + if (!virDomainObjIsActive(dom)) + return 0; + + if (restag == VIR_RESCTRL_MONITOR_TYPE_CACHE) { + restype = "cache"; + } else { + VIR_DEBUG("Invalid CPU resource type");VIR_DEBUG? virReportError would seem to be more appropriate. As I've stated before there will be an error for some reason generated. Originally, the purpose of using VIR_DEBUG,not virReportError, to report error is to avoid showing partial monitor information with an error message on screen. I will split this function into three functions, qemuDomainGetCpuResMonitorData qemuDomainGetStatsCpuResMonitorPerTag and qemuDomainGetStatsCPUResMonitor qemuDomainGetCpuResMonitorData will get necessary data and reported any error occurred to user promptly. qemuDomainGetStatsCpuResMonitorPerTag and qemuDomainGetStatsCPUResMonitor will report the cache monitor result. This will avoid the mix of cache monitor result and error message if error occurs. Please review my new code. + return -1; + } + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + domresmon = resctrl->monitors[j]; + if (virResctrlMonitorIsRunning(domresmon->instance) && + domresmon->tag == restag)Knowing how heavy wait *IsRunning is, the order of checking should be reversed at the very least. virResctrlMonitorIsRunning will not be called any more as I stated. and the code will be changed, my new code will be more efficient. + nmonitors++; + } + } + + if (nmonitors) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.count", restype); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nmonitors) < 0) + goto cleanup; + }All that above to ensure nmonitors == resctrl->nmonitors??? I think there's a lot of unnecessary stuff. This part will be removed. Here the purpose of code is to get the total 'valid' (after a successful check of virResctrlMonitorIsRunning) monitor groups. It actually performing the validation of nmonitors == resctrl->nmonitors, just as you said. Now I think there is no need to validate the monitor any more. + + for (i = 0; i < dom->def->nresctrls; i++) { + resctrl = dom->def->resctrls[i]; + + for (j = 0; j < resctrl->nmonitors; j++) { + size_t l = 0; + virResctrlMonitorPtr monitor = resctrl->monitors[j]->instance; + const char *id = virResctrlMonitorGetID(monitor); + + if (!id) + goto cleanup;I would think you have other problems in this case. Besides there's no error message associated with this. Error message should be reported. And will be reported. Please review the new code. + + domresmon = resctrl->monitors[j]; + + if (!virResctrlMonitorIsRunning(domresmon->instance)) + continue;Oh my one call for each iteration, ouch - highly inefficient. virResctrlMonitorIsRunning disappears in code and patch for this file will be redesigned. Please renew the new code. If virResctrlMonitorGetCacheOccupancy returns without error, nstats will be assigned with+ + if (!(rawvcpus = virBitmapFormat(domresmon->vcpus))) + goto cleanup; + + vcpus = qemuDomainVcpuFormatHelper(rawvcpus); + if (!vcpus) + goto cleanup; + + if (virResctrlMonitorGetCacheOccupancy(monitor, &stats, + &nstats) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.name", restype, imonitor); + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + id) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.vcpus", restype, imonitor); + + if (virTypedParamsAddString(&record->params, + &record->nparams, + maxparams, + param_name, + vcpus) < 0) + goto cleanup; + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.count", restype, imonitor); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + nstats) < 0) + goto cleanup; + + for (l = 0; l < nstats; l++) { + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.id", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].id) < 0) + goto cleanup; + + + snprintf(param_name, VIR_TYPED_PARAM_FIELD_LENGTH, + "cpu.%s.monitor.%d.bank.%ld.bytes", + restype, imonitor, l); + if (virTypedParamsAddUInt(&record->params, + &record->nparams, + maxparams, + param_name, + stats[l].val) < 0) + goto cleanup; + } + + VIR_FREE(stats);nstats = 0; the proper bank group number. If virResctrlMonitorGetCacheOccupancy returns an error, then code will go through cleanup path and returns. Seems no need to clear nstats to 0. Any way in my new code, this nstats will not be used any more, please review the logic of new code later. too!+ VIR_FREE(vcpus); + imonitor++; + } + } + + ret = 0; + cleanup: + VIR_FREE(vcpus); + return ret; +} + + static int qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, virDomainObjPtr dom, @@ -19735,6 +19960,10 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED, return -1; } + if (qemuDomainGetStatsCpuResource(driver, dom, record, maxparams, privflags, + VIR_RESCTRL_MONITOR_TYPE_CACHE) < 0) + return -1; + return 0; } diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index fa3e6e9..02e7095 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -2713,3 +2713,133 @@ virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor) return ret; } + + +static int +virResctrlMonitorStatsSorter(const void *a, + const void *b) +{ + return ((virResctrlMonitorStatsPtr)a)->id + - ((virResctrlMonitorStatsPtr)b)->id; +} + + +/* + * virResctrlMonitorGetStats + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @resource: The name for resource name. 'llc_occupancy' for cache resource. + * "mbm_total_bytes" and "mbm_local_bytes" for memory bandwidth resource. + * @stats: Array of virResctrlMonitorStatsPtr for holding cache or memory + * bandwidth usage data. + * @nstats: A size_t pointer to hold the returned array length of @stats + * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ +static int +virResctrlMonitorGetStats(virResctrlMonitorPtr monitor, + const char *resource, + virResctrlMonitorStatsPtr *stats, + size_t *nstats) +{ + int rv = -1; + int ret = -1; + DIR *dirp = NULL; + char *datapath = NULL; + struct dirent *ent = NULL; + virResctrlMonitorStatsPtr stat = NULL; + + if (!monitor) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid resctrl monitor")); + return -1; + } + + if (virAsprintf(&datapath, "%s/mon_data", monitor->path) < 0) + return -1; + + if (virDirOpen(&dirp, datapath) < 0) + goto cleanup; + + *nstats = 0; + while (virDirRead(dirp, &ent, datapath) > 0) { + char *node_id = NULL; + + if (VIR_ALLOC(stat) < 0) + goto cleanup; + + /* Looking for directory that contains resource utilization + * information file. The directory name is arranged in format + * "mon_<node_name>_<node_id>". For example, "mon_L3_00" and + * "mon_L3_01" are two target directories for a two nodes system + * with resource utilization data file for each node respectively. + */ + if (ent->d_type != DT_DIR) + continue; + + /* Looking for directory has a prefix 'mon_L' */ + if (!(node_id = STRSKIP(ent->d_name, "mon_L"))) + continue; + + /* Looking for directory has another '_' */ + node_id = strchr(node_id, '_'); + if (!node_id) + continue; + + /* Skip the character '_' */ + if (!(node_id = STRSKIP(node_id, "_"))) + continue; + + /* The node ID number should be here, parsing it. */ + if (virStrToLong_uip(node_id, NULL, 0, &stat->id) < 0) + goto cleanup; + + rv = virFileReadValueUint(&stat->val, "%s/%s/%s", datapath, + ent->d_name, resource); + if (rv == -2) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("File '%s/%s/%s' does not exist."), + datapath, ent->d_name, resource); + } + if (rv < 0) + goto cleanup; + + if (VIR_APPEND_ELEMENT(*stats, *nstats, *stat) < 0) + goto cleanup; + } + + /* Sort in id's ascending order */ + qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter);Coverity notes that *stats could be NULL - if the above while loop gets nothing... Will be changed to if (*nstats) qsort(*stats, *nstats, sizeof(*stat), virResctrlMonitorStatsSorter); + + ret = 0; + cleanup: + VIR_FREE(datapath); + VIR_FREE(stat); + VIR_DIR_CLOSE(dirp); + return ret; +} + + +/* + * virResctrlMonitorGetCacheOccupancy + * + * @monitor: The monitor that the statistic data will be retrieved from. + * @caches: Array of virResctrlMonitorStatsPtr for receiving cache occupancy + * data. Caller is responsible to free this array. + * @ncaches: A size_t pointer to hold the returned array length of @cachesThey're @stats and @nstats elsewhere, be consistent. Will use @stats and @nstats for virResctrlMonitorGetCacheOccupancy and later API (for memBW). + * + * Get cache or memory bandwidth utilization information. + * + * Returns 0 on success, -1 on error. + */ + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches)They're @stats and @nstats elsewhere, be consistent. Will use @stats and @nstats for parameters. +{ + return virResctrlMonitorGetStats(monitor, "llc_occupancy", + caches, ncaches); +} diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index 8d8fdc2..004c40e 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -191,6 +191,13 @@ virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl, typedef struct _virResctrlMonitor virResctrlMonitor; typedef virResctrlMonitor *virResctrlMonitorPtr; +typedef struct _virResctrlMonitorStats virResctrlMonitorStats; +typedef virResctrlMonitorStats *virResctrlMonitorStatsPtr; +struct _virResctrlMonitorStats { + unsigned int id; + unsigned int val; +}; + virResctrlMonitorPtr virResctrlMonitorNew(void); @@ -222,4 +229,9 @@ virResctrlMonitorRemove(virResctrlMonitorPtr monitor); bool virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor); + +int +virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr monitor, + virResctrlMonitorStatsPtr *caches, + size_t *ncaches);They're @stats and @nstats elsewhere, be consistent. Will use @stats and @nstats for parameters. #endif /* __VIR_RESCTRL_H__ */I'm sure there's more I'll pick up later, but you should have plenty for now. As to splitting - I think some splitting could be done, I'll leave it up to you... Lots of change across multiple areas - ask yourself - can something be separated out? The virresctrl.* part could be separated, and how about merging this part with previous patch that adding more monitor interfaces in virresctrl.*. I'll do this in my next update for your review. John Thanks for review. Huaqiang |
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list