> -----Original Message----- > From: John Ferlan [mailto:jferlan@xxxxxxxxxx] > Sent: Wednesday, July 13, 2016 4:02 AM > To: Ren, Qiaowei <qiaowei.ren@xxxxxxxxx>; libvir-list@xxxxxxxxxx > Cc: Peter Krempa <pkrempa@xxxxxxxxxx> > Subject: Re: [PATCH 1/1] perf: add more perf events support > > > > On 06/29/2016 08:10 PM, Qiaowei Ren wrote: > > With current perf framework, this patch adds support to more perf > ^^^^^^^ for more perf > > > events, including cache missing, cache peference, cpu cycles, > > A quick google search turns up "cache references" - there's just too many > peference or peferences references to comment on them all, but they all need > to be "references" > John, according perf code from linux kernel, 'cache peference' is from linux kernel code. Certainly 'perf list' command show the 'references', and I will change it to this. > > instrction, etc.. > > instructions > > > > > Signed-off-by: Qiaowei Ren <qiaowei.ren@xxxxxxxxx> > > --- > > docs/formatdomain.html.in | 24 +++++++++++ > > docs/schemas/domaincommon.rng | 4 ++ > > include/libvirt/libvirt-domain.h | 39 +++++++++++++++++ > > src/libvirt-domain.c | 8 ++++ > > src/qemu/qemu_driver.c | 23 +++++----- > > src/util/virperf.c | 65 ++++++++++++++++++++++++++++- > > src/util/virperf.h | 4 ++ > > tests/genericxml2xmlindata/generic-perf.xml | 4 ++ > > 8 files changed, 158 insertions(+), 13 deletions(-) > > > > I see no changes for virsh.pod, see commit id '3110363d' for a recent change > Peter made in this space... > > I think perhaps it may also be worthwhile to "in a separate patch" alter the > 'domstats --perf' description to simply reference the 'perf' > description where each of the collect perf.* events can be listed and described. > > Each of the collectible events could have some sort of tabular output - see how > 'vol-wipe' describes the various supported algorithms. So much easier to read > than one long sentence. > Sure. I will add one patch about this. > > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in > > index f660aa6..7999e43 100644 > > --- a/docs/formatdomain.html.in > > +++ b/docs/formatdomain.html.in > > @@ -1839,6 +1839,10 @@ > > <event name='cmt' enabled='yes'/> > > <event name='mbmt' enabled='no'/> > > <event name='mbml' enabled='yes'/> > > + <event name='cache_misses' enabled='no'/> > > + <event name='cache_peferences' enabled='no'/> > > + <event name='instructions' enabled='no'/> > > + <event name='cpu_cycles' enabled='no'/> > > </perf> > > ... > > </pre> > > @@ -1864,6 +1868,26 @@ > > <td>bandwidth of memory traffic for a memory controller</td> > > <td><code>perf.mbml</code></td> > > </tr> > > + <tr> > > + <td><code>cache_misses</code></td> > > + <td>the amount of cache missing by applications running on the > > + platform</td> > > is this the count of caches misses? amount implies perhaps other things. > Yes. I will change it. > > + <td><code>perf.cache_misses</code></td> > > + </tr> > > + <tr> > > + <td><code>cache_peferences</code></td> > > + <td>the amount of cache hit by applications running on the platform</td> > > + <td><code>perf.cache_peferences</code></td> > > similar is this the count of cache hits, right? > Yes. > > + </tr> > > + <tr> > > + <td><code>instructions</code></td> > > + <td>the amount of instructions by applications running on the > > + platform</td> > > the count of instructions executed... > > > + <td><code>perf.instructions</code></td> > > + </tr> > > + <tr> > > + <td><code>cpu_cycles</code></td> > > + <td>the amount of cycles one instruction needs</td> > > the number/count of cpu cycles > > > + <td><code>perf.cpu_cycles</code></td> > > + </tr> > > </table> > > > > <h3><a name="elementsDevices">Devices</a></h3> > > diff --git a/docs/schemas/domaincommon.rng > > b/docs/schemas/domaincommon.rng index 563cb3c..e41dc3a 100644 > > --- a/docs/schemas/domaincommon.rng > > +++ b/docs/schemas/domaincommon.rng > > @@ -414,6 +414,10 @@ > > <value>cmt</value> > > <value>mbmt</value> > > <value>mbml</value> > > + <value>cache_misses</value> > > + <value>cache_peferences</value> > > + <value>instructions</value> > > + <value>cpu_cycles</value> > > </choice> > > </attribute> > > <attribute name="enabled"> > > diff --git a/include/libvirt/libvirt-domain.h > > b/include/libvirt/libvirt-domain.h > > index 7ea93aa..b79cdb0 100644 > > --- a/include/libvirt/libvirt-domain.h > > +++ b/include/libvirt/libvirt-domain.h > > @@ -1947,6 +1947,45 @@ void > virDomainStatsRecordListFree(virDomainStatsRecordPtr *stats); > > */ > > # define VIR_PERF_PARAM_MBML "mbml" > > > > +/** > > + * VIR_PERF_PARAM_CACHE_MISSES: > > + * > > + * Macro for typed parameter name that represents cache_misses perf > > + * event which can be used to measure the amount of cache missing by > > s/amount/count > > s/cache missing/cache misses/ (missing is a very different context!) > > > + * applications running on the platform. It corresponds to the > > + * "perf.cache_misses" field in the *Stats APIs. > > + */ > > +# define VIR_PERF_PARAM_CACHE_MISSES "cache_misses" > > + > > +/** > > + * VIR_PERF_PARAM_CACHE_REFERENCES: > > + * > > + * Macro for typed parameter name that represents cache_peferences > > + * perf event which can be used to measure the amount of cache hit > > similar... amount/count ... hit/hits > > > + * by applications running on the platform. It corresponds to the > > + * "perf.cache_peferences" field in the *Stats APIs. > > + */ > > +# define VIR_PERF_PARAM_CACHE_REFERENCES "cache_peferences" > > + > > +/** > > + * VIR_PERF_PARAM_INSTRUCTIONS: > > + * > > + * Macro for typed parameter name that represents instructions perf > > + * event which can be used to measure the amount of instructions > > similar amount/count > > > + * by applications running on the platform. It corresponds to the > > + * "perf.instructions" field in the *Stats APIs. > > + */ > > +# define VIR_PERF_PARAM_INSTRUCTIONS "instructions" > > + > > +/** > > + * VIR_PERF_PARAM_CPU_CYCLES: > > + * > > + * Macro for typed parameter name that represents cpu_cycles perf > > +event > > + * which can be used to measure how many cycles one instruction needs. > > + * It corresponds to the "perf.cpu_cycles" field in the *Stats APIs. > > + */ > > The cycles and instructions seem to me to be things that could be very large and > awkward to print > > +# define VIR_PERF_PARAM_CPU_CYCLES "cpu_cycles" > > + > > int virDomainGetPerfEvents(virDomainPtr dom, > > virTypedParameterPtr *params, > > int *nparams, diff --git > > a/src/libvirt-domain.c b/src/libvirt-domain.c index 4e71a94..b817e4b > > 100644 > > --- a/src/libvirt-domain.c > > +++ b/src/libvirt-domain.c > > @@ -11452,6 +11452,14 @@ virConnectGetDomainCapabilities(virConnectPtr > conn, > > * "perf.mbml" - the amount of data (bytes/s) sent through the memory > controller > > * on the socket as unsigned long long. It is produced by mbml > > * perf event. > > + * "perf.cache_misses" - the amount of cache missing as unsigned long long. > > + * It is produced by cache_misses perf event. > > + * "perf.cache_peferences" - the amount of cache hit as unsigned long long. > > + * It is produced by cache_peferences perf event. > > + * "perf.instructions" - the amount of instructions as unsigned long long. > > + * It is produced by instructions perf event. > > + * "perf.cpu_cycles" - the amount of cycles one instruction needs as > > + unsigned > > + * long long. It is produced by cpu_cycles perf event. > > Similar "amount" vs. "count" > > > * > > * Note that entire stats groups or individual stat fields may be missing from > > * the output in case they are not supported by the given hypervisor, > > are not diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > > index 61d184b..bea753f 100644 > > --- a/src/qemu/qemu_driver.c > > +++ b/src/qemu/qemu_driver.c > > @@ -9613,6 +9613,10 @@ qemuDomainSetPerfEvents(virDomainPtr dom, > > VIR_PERF_PARAM_CMT, VIR_TYPED_PARAM_BOOLEAN, > > VIR_PERF_PARAM_MBMT, VIR_TYPED_PARAM_BOOLEAN, > > VIR_PERF_PARAM_MBML, > > VIR_TYPED_PARAM_BOOLEAN, > > + VIR_PERF_PARAM_CACHE_MISSES, > VIR_TYPED_PARAM_BOOLEAN, > > + VIR_PERF_PARAM_CACHE_REFERENCES, > VIR_TYPED_PARAM_BOOLEAN, > > + VIR_PERF_PARAM_INSTRUCTIONS, > VIR_TYPED_PARAM_BOOLEAN, > > + VIR_PERF_PARAM_CPU_CYCLES, > > + VIR_TYPED_PARAM_BOOLEAN, > > NULL) < 0) > > return -1; > > > > @@ -18941,10 +18945,10 @@ qemuDomainGetStatsBlock(virQEMUDriverPtr > > driver, #undef QEMU_ADD_COUNT_PARAM > > > > static int > > -qemuDomainGetStatsPerfRdt(virPerfPtr perf, > > - virPerfEventType type, > > - virDomainStatsRecordPtr record, > > - int *maxparams) > > +qemuDomainGetStatsPerfOneEvent(virPerfPtr perf, > > + virPerfEventType type, > > + virDomainStatsRecordPtr record, > > + int *maxparams) > > This change seems separable. That is it's own patch to change the name of the > function because it's going to be multiple/general purpose soon. > Ok. I will separate it from this patch. > > { > > char param_name[VIR_TYPED_PARAM_FIELD_LENGTH]; > > uint64_t value = 0; > > @@ -18980,14 +18984,9 @@ qemuDomainGetStatsPerf(virQEMUDriverPtr > driver ATTRIBUTE_UNUSED, > > if (!virPerfEventIsEnabled(priv->perf, i)) > > continue; > > > > - switch (i) { > > - case VIR_PERF_EVENT_CMT: > > - case VIR_PERF_EVENT_MBMT: > > - case VIR_PERF_EVENT_MBML: > > And removing the need for a switch could be separate too... Trying to think if > it's still necessary though. virPerfEventIsEnabled will return NULL if i >= > VIR_PERF_EVENT_LAST, so it doesn't seem some future client could request an > event that some older driver doesn't support. > virPerfEventIsEnabled will return false (not NULL) if I >= VIR_PERF_EVENT_LAST. Some future clients could not enable those events that older driver doesn't support. > > - if (qemuDomainGetStatsPerfRdt(priv->perf, i, record, maxparams) < 0) > > - goto cleanup; > > - break; > > - } > > + if (qemuDomainGetStatsPerfOneEvent(priv->perf, i, > > + record, maxparams) < 0) > > + goto cleanup; > > } > > > > ret = 0; > > diff --git a/src/util/virperf.c b/src/util/virperf.c index > > 4661ba3..a3d2bc6 100644 > > --- a/src/util/virperf.c > > +++ b/src/util/virperf.c > > @@ -38,7 +38,9 @@ VIR_LOG_INIT("util.perf"); #define VIR_FROM_THIS > > VIR_FROM_PERF > > > > VIR_ENUM_IMPL(virPerfEvent, VIR_PERF_EVENT_LAST, > > - "cmt", "mbmt", "mbml"); > > + "cmt", "mbmt", "mbml", > > + "cache_misses", "cache_peferences", > > + "instructions", "cpu_cycles"); > > > > struct virPerfEvent { > > int type; > > @@ -189,6 +191,60 @@ virPerfRdtEnable(virPerfEventPtr event, > > return -1; > > } > > > > +static int > > +virPerfGeneralEnable(virPerfEventPtr event, > > + pid_t pid) > > Currently, these are less "General" and more "Hardware" events > > Based on what I see in perf_event.h, I suspect the future could hold getting > software, tracepoint, hw_cache, raw, and breakpoint events too. > > Perhaps in order to be more "General" the "type" and "config" parameters could > be passed... > > > +{ > > + struct perf_event_attr attr; > > + > > + memset(&attr, 0, sizeof(attr)); > > + attr.size = sizeof(attr); > > + attr.inherit = 1; > > + attr.disabled = 1; > > + attr.enable_on_exec = 0; > > + > > + switch (event->type) { > > + case VIR_PERF_EVENT_CACHE_MISSES: > > + attr.type = PERF_TYPE_HARDWARE; > ^^^^^^^^^ > this doesn't change for any of the cases > > > + attr.config = PERF_COUNT_HW_CACHE_MISSES; > > + break; > > + case VIR_PERF_EVENT_CACHE_REFERENCES: > > + attr.type = PERF_TYPE_HARDWARE; > > + attr.config = PERF_COUNT_HW_CACHE_REFERENCES; > > + break; > > + case VIR_PERF_EVENT_INSTRUCTIONS: > > + attr.type = PERF_TYPE_HARDWARE; > > + attr.config = PERF_COUNT_HW_INSTRUCTIONS; > > + break; > > + case VIR_PERF_EVENT_CPU_CYCLES: > > + attr.type = PERF_TYPE_HARDWARE; > > + attr.config = PERF_COUNT_HW_CPU_CYCLES; > > + break; > > + } > > ...Seems like it would be possible to create some sort of static table/matrix that > would be able to convert the VIR_PERF_EVENT_* into their respective > "attr.type" and "attr.config", so that this function doesn't have the switch and > the calling function passes by value the 'type' and 'config'. Assuming of course > the future is to get other events. > Yes. This is a good idea. And I will add the static table. > > + > > + event->fd = syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0); > > + if (event->fd < 0) { > > + virReportSystemError(errno, > > + _("Unable to open perf event for %s"), > > + virPerfEventTypeToString(event->type)); > > + goto error; > > + } > > + > > + if (ioctl(event->fd, PERF_EVENT_IOC_ENABLE) < 0) { > > + virReportSystemError(errno, > > + _("Unable to enable perf event for %s"), > > + virPerfEventTypeToString(event->type)); > > + goto error; > > + } > > + > > + event->enabled = true; > > + return 0; > > + > > + error: > > + VIR_FORCE_CLOSE(event->fd); > > + return -1; > > +} > > + > > int > > virPerfEventEnable(virPerfPtr perf, > > virPerfEventType type, @@ -205,6 +261,13 @@ > > virPerfEventEnable(virPerfPtr perf, > > if (virPerfRdtEnable(event, pid) < 0) > > I see this PerfRdt reference here which led me to wonder why both aren't being > changed, but I think I understand now. Of course, now I wonder what the Rdt > acronym means (is it Resource Director Technology?). If it's an acronym, it's nice > to see it spelled out at least once. Probably in some earlier commit which I didn't > chase. The lack of function comments is, well, frustrating. > The comments about RDT will be added. > > return -1; > > break; > > + case VIR_PERF_EVENT_CACHE_MISSES: > > + case VIR_PERF_EVENT_CACHE_REFERENCES: > > + case VIR_PERF_EVENT_INSTRUCTIONS: > > + case VIR_PERF_EVENT_CPU_CYCLES: > > + if (virPerfGeneralEnable(event, pid) < 0) > > + return -1; > > + break; > > here is where there could be some sort of tabular/matrix reference to get the > 'type' and 'config' values filled in to pass. > > > case VIR_PERF_EVENT_LAST: > > virReportError(VIR_ERR_INTERNAL_ERROR, > > _("Unexpected perf event type=%d"), type); > > diff --git a/src/util/virperf.h b/src/util/virperf.h index > > 7163410..7129370 100644 > > --- a/src/util/virperf.h > > +++ b/src/util/virperf.h > > @@ -28,6 +28,10 @@ typedef enum { > > VIR_PERF_EVENT_CMT, > > VIR_PERF_EVENT_MBMT, > > VIR_PERF_EVENT_MBML, > > + VIR_PERF_EVENT_CACHE_MISSES, > > + VIR_PERF_EVENT_CACHE_REFERENCES, > > + VIR_PERF_EVENT_INSTRUCTIONS, > > + VIR_PERF_EVENT_CPU_CYCLES, > > Not that it matters too much since the numbers are different, but the order here > is different than perf_hw_id. "Sometimes" it's kinder to access memory > sequentially rather than somewhat randomly. Easier to see for future changers > what was already too... > > Any reason to not go after the other 6 events in 'perf_hw_id'? > There are a lot of different perf events, but these 4 events will be used by OpenStack, and so currently I only add them in this patch. > > > > VIR_PERF_EVENT_LAST > > } virPerfEventType; > > diff --git a/tests/genericxml2xmlindata/generic-perf.xml > > b/tests/genericxml2xmlindata/generic-perf.xml > > index 394d2a6..6428ebd 100644 > > --- a/tests/genericxml2xmlindata/generic-perf.xml > > +++ b/tests/genericxml2xmlindata/generic-perf.xml > > @@ -16,6 +16,10 @@ > > <event name='cmt' enabled='yes'/> > > <event name='mbmt' enabled='no'/> > > <event name='mbml' enabled='yes'/> > > + <event name='cache_misses' enabled='no'/> > > + <event name='cache_peferences' enabled='no'/> > > + <event name='instructions' enabled='no'/> > > + <event name='cpu_cycles' enabled='no'/> > > All 4 are 'no', maybe make 1 or 2 'yes' (not that it matters, but they are a > different table underneath the covers) > Yes. Thanks, Qiaowei -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list