On Tue, Jun 28, 2022 at 05:23:11PM +0200, Michal Prívozník wrote: > On 6/24/22 10:14, Amneesh Singh wrote: > > Related: https://gitlab.com/libvirt/libvirt/-/issues/276 > > > > This patch uses qemuMonitorQueryStats to query "halt_poll_success_ns" > > and "halt_poll_fail_ns" for every vCPU. The respective values for each > > vCPU are then added together. > > > > Signed-off-by: Amneesh Singh <natto@xxxxxxxxxxxxx> > > --- > > src/qemu/qemu_driver.c | 48 ++++++++++++++++++++++++++++++++++++++---- > > 1 file changed, 44 insertions(+), 4 deletions(-) > > > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > > index 3b5c3db6..0a2be6d3 100644 > > --- a/src/qemu/qemu_driver.c > > +++ b/src/qemu/qemu_driver.c > > @@ -18057,10 +18057,50 @@ qemuDomainGetStatsCpuHaltPollTime(virDomainObj *dom, > > { > > unsigned long long haltPollSuccess = 0; > > unsigned long long haltPollFail = 0; > > - pid_t pid = dom->pid; > > + qemuDomainObjPrivate *priv = dom->privateData; > > + bool canQueryStats = virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_QUERY_STATS); > > Is this variable really needed? I mean, we can just: > > if (virQEMUCapsGet(...) { > > } else { > > } > > But if you want to avoid long lines, then perhaps rename the variable to > queryStatsCap? This way it's more obvious what the variable reflects. > Stats can be queried in both cases ;-) Sure, that sounds doable :) > > > > > - if (virHostCPUGetHaltPollTime(pid, &haltPollSuccess, &haltPollFail) < 0) > > - return 0; > > + if (!canQueryStats) { > > + pid_t pid = dom->pid; > > + > > + if (virHostCPUGetHaltPollTime(pid, &haltPollSuccess, &haltPollFail) < 0) > > + return 0; > > + } else { > > + size_t i; > > + qemuMonitorQueryStatsTargetType target = QEMU_MONITOR_QUERY_STATS_TARGET_VCPU; > > + qemuMonitorQueryStatsProvider *provider = NULL; > > + g_autoptr(GPtrArray) providers = NULL; > > + g_autoptr(GPtrArray) queried_stats = NULL; > > + const char *success_str = "halt_poll_success_ns"; > > + const char *fail_str = "halt_poll_fail_ns"; > > + > > + provider = qemuMonitorQueryStatsProviderNew(QEMU_MONITOR_QUERY_STATS_PROVIDER_KVM); > > + provider->names = g_new0(char *, 3); > > + provider->names[0] = g_strdup(success_str), provider->names[1] = g_strdup(fail_str); > > I'm starting to wonder whether this is a good interface. These ->names[] > array is never changed, so maybe we can have it as a NULL terminated > list of constant strings? For instance: > > provider = qemuMonitorQueryStatsProviderNew(); > provider->names = {"halt_poll_success_ns", "halt_poll_fail_ns", NULL}; Well, cannot really assign char ** with a scalar initialization, but what might work is something like const char *names[] = { success_str, fail_str, NULL }; provider = qemuMonitorQueryStatsProviderNew(QEMU_MONITOR_QUERY_STATS_PROVIDER_KVM); provider->names = g_strdupv((char **) names); Another thought was using GStrvBuilder but it is not avaiable as per GLIB_VERSION_MAX. I too think that the current approach is not very nice. A variadic function similar to g_strv_builder_add_many that initializes a char ** would be nice. > > > + > > + providers = g_ptr_array_new_full(1, (GDestroyNotify) qemuMonitorQueryStatsProviderFree); > > + g_ptr_array_add(providers, provider); > > + > > + queried_stats = qemuMonitorQueryStats(priv->mon, target, NULL, providers); > > This issues monitor command without proper checks. Firstly, you need to > check whether job acquiring (that s/false/true/ change done in the last > hunk) was successful and the domain is still running: > > if (!HAVE_JOB(privflags) || !virDomainObjIsActive(dom)) { > /* code here */ > } > > Then, you need to so called "enter the monitor" and "exit the monitor". > So what happens here is that @vm is when this function is called. Now, > issuing a monitor command with the domain object lock held is > potentially very dangerous because if QEMU takes long time to reply (or > it doesn't reply at all) the domain object is locked an can't be > destroyed (virsh destroy) - because the first thing that > qemuDomainDestroyFlags() does is locking the domain object. Also, you > want to make sure no other thread is currently talking on the monitor - > so the monitor has to be locked too. We have two convenient functions > for these operations: > > qemuDomainObjEnterMonitor() > qemuDomainObjExitMonitor() > > This is all described in more detail in > docs/kbase/internals/qemu-threads.rst. > > Having said all of this, I think we can fallback to the current behavior > if job wasn't acquired successfully. Therefore the condition at the very > beginning might look like this: > > if (queryStatsCap && HAVE_JOB() && virDomainObjIsActive()) { > ... > qemuDomainEnterMonitor(); > qemuMonitorQueryStats(); > qemuDomainExitMonitor(); > } else { > virHostCPUGetHaltPollTime(); > } > I see that makes sense. I shall do the necessary changes. Thanks for the detailed explanation. > > + > > + if (!queried_stats) > > + return 0; > > + > > + for (i = 0; i < queried_stats->len; i++) { > > + unsigned long long curHaltPollSuccess, curHaltPollFail; > > + GHashTable *cur_table = queried_stats->pdata[i]; > > + virJSONValue *success, *fail; > > + > > + success = g_hash_table_lookup(cur_table, success_str); > > + fail = g_hash_table_lookup(cur_table, fail_str); > > + > > + ignore_value(virJSONValueGetNumberUlong(success, &curHaltPollSuccess)); > > + ignore_value(virJSONValueGetNumberUlong(fail, &curHaltPollFail)); > > I don't think this is a safe thing to do. What if either of @success or > @fail is NULL? That can happen when QEMU didn't return requested member. > True, at that point the 'query-stats' command should have failed, but I > think it's more defensive if we checked these pointers. My worry is that > virJSONValueGetNumberUlong() dereferences the pointer right away. > I do understand that this is not an unfounded worry. I shall put the checks there. > > + > > + haltPollSuccess += curHaltPollSuccess; > > + haltPollFail += curHaltPollFail; > > + } > > + } > > > > if (virTypedParamListAddULLong(params, haltPollSuccess, "cpu.haltpoll.success.time") < 0 || > > virTypedParamListAddULLong(params, haltPollFail, "cpu.haltpoll.fail.time") < 0) > > @@ -18915,7 +18955,7 @@ static virQEMUCapsFlags queryDirtyRateRequired[] = { > > > > static struct qemuDomainGetStatsWorker qemuDomainGetStatsWorkers[] = { > > { qemuDomainGetStatsState, VIR_DOMAIN_STATS_STATE, false, NULL }, > > - { qemuDomainGetStatsCpu, VIR_DOMAIN_STATS_CPU_TOTAL, false, NULL }, > > + { qemuDomainGetStatsCpu, VIR_DOMAIN_STATS_CPU_TOTAL, true, NULL }, > > { qemuDomainGetStatsBalloon, VIR_DOMAIN_STATS_BALLOON, true, NULL }, > > { qemuDomainGetStatsVcpu, VIR_DOMAIN_STATS_VCPU, true, NULL }, > > { qemuDomainGetStatsInterface, VIR_DOMAIN_STATS_INTERFACE, false, NULL }, > > Please feel free to object to anything I wrote. Maybe you have more > patches written on a local branch that justify your choices. I don't know. > > Michal > Thanks for taking the time to review the patches. Amneesh
Attachment:
signature.asc
Description: PGP signature