On 06/14/2018 04:10 PM, John Ferlan wrote:
>
> [...]
>
>>>> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
>>>> index da773b76cb..1a1d34620d 100644
>>>> --- a/include/libvirt/libvirt-domain.h
>>>> +++ b/include/libvirt/libvirt-domain.h
>>>> @@ -2055,6 +2055,7 @@ typedef enum {
>>>>      VIR_CONNECT_GET_ALL_DOMAINS_STATS_SHUTOFF = VIR_CONNECT_LIST_DOMAINS_SHUTOFF,
>>>>      VIR_CONNECT_GET_ALL_DOMAINS_STATS_OTHER = VIR_CONNECT_LIST_DOMAINS_OTHER,
>>>>
>>>> +    VIR_CONNECT_GET_ALL_DOMAINS_STATS_BEST_EFFORT = 1 << 29, /* ignore stalled domains */
>>>
>>> "stalled"? How about "don't wait on other jobs"
>>
>> Well, my hidden idea was also that we can "misuse" this flag to not wait
>> in other places too. For instance, if we find out (somehow) that a
>> domain is in D state, we would consider it stale even without any job
>> running on it. Okay, we have no way of detecting if qemu is in D state
>> right now, but you get my point. If we don't lock this flag down to just
>> domain jobs (which not all drivers have, btw), we can use it more widely.
>>
>
> Ahhh, I see what you're driving at - some flag name which allows us to
> reuse the name for other conditions which have caused delays in
> collection due to some underlying condition...
>
> a/k/a: {AVOID|IGNORE}_BLOCKING_CONDITIONS...
>
>>>
>>>>      VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING = 1 << 30, /* include backing chain for block stats */
>>>>      VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS = 1U << 31, /* enforce requested stats */
>
> Would the new flag be mutually exclusive with ENFORCE? I want some list
> of stats, but don't wait to get the answer.

I thought about this and I don't think so. The current behaviour is that
even with ENFORCE you might still miss some stats if acquiring the job
fails. ENFORCE merely tells libvirt to check whether the requested stats
are available (i.e. whether the driver knows how to fetch them).
Therefore I don't think these two flags are mutually exclusive.
>
>>>> } virConnectGetAllDomainStatsFlags;
>>>> diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
>>>> index d44b553c74..906b097adb 100644
>>>> --- a/src/libvirt-domain.c
>>>> +++ b/src/libvirt-domain.c
>>>> @@ -11502,6 +11502,11 @@ virConnectGetDomainCapabilities(virConnectPtr conn,
>>>>  * fields for offline domains if the statistics are meaningful only for a
>>>>  * running domain.
>>>>  *
>>>> + * Passing VIR_CONNECT_GET_ALL_DOMAINS_STATS_BEST_EFFORT in
>>>> + * @flags means when libvirt is unable to fetch stats for any of
>>>> + * the domains (for whatever reason) such domain is silently
>>>> + * ignored.
>>>> + *
>>>
>>> So we're adding this for both capabilities and...
>>
>> Not really, this is just git generating a misleading diff (for a human
>> reader). In fact this flag is added to virConnectGetAllDomainStats()
>> and virDomainListGetStats().
>>
>>>
>>>>  * Similarly to virConnectListAllDomains, @flags can contain various flags to
>>>>  * filter the list of domains to provide stats for.
>>>>  *
>>>> @@ -11586,6 +11591,11 @@ virConnectGetAllDomainStats(virConnectPtr conn,
>>>>  * fields for offline domains if the statistics are meaningful only for a
>>>>  * running domain.
>>>>  *
>>>> + * Passing VIR_CONNECT_GET_ALL_DOMAINS_STATS_BEST_EFFORT in
>>>> + * @flags means when libvirt is unable to fetch stats for any of
>>>> + * the domains (for whatever reason) such domain is silently
>>>> + * ignored.
>>>> + *
>>>
>>> ...stats in the API documentation, but...
>>>
>>> BTW: the domain isn't silently ignored, instead only a subset of
>>> statistics is returned for the domain. That subset being statistics
>>> that don't involve querying the underlying hypervisor.
>>
>> Okay.
>>
>>>
>>>>  * Note that any of the domain list filtering flags in @flags may be rejected
>>>>  * by this function.
>>>>  *
>>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>>> index 38ea865ce3..28338de05f 100644
>>>> --- a/src/qemu/qemu_driver.c
>>>> +++ b/src/qemu/qemu_driver.c
>>>> @@ -20446,6 +20446,7 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
>>>
>>> ...only adding the check in the DomainStats?
>>
>> Sure. virDomainListGetStats calls driver->connectGetAllDomainStats()
>> just like virConnectGetAllDomainStats() does, so on the driver side
>> there is only one function serving both public APIs.
>>
>
> The ... are all related - you've answered the concern above. I was too
> lazy to look at the actual placement - just took the git diff output and
> assumed Capabilities.
>
>>>
>>>>      virCheckFlags(VIR_CONNECT_LIST_DOMAINS_FILTERS_ACTIVE |
>>>>                    VIR_CONNECT_LIST_DOMAINS_FILTERS_PERSISTENT |
>>>>                    VIR_CONNECT_LIST_DOMAINS_FILTERS_STATE |
>>>> +                  VIR_CONNECT_GET_ALL_DOMAINS_STATS_BEST_EFFORT |
>>>>                    VIR_CONNECT_GET_ALL_DOMAINS_STATS_BACKING |
>>>>                    VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS, -1);
>>>>
>>>> @@ -20480,9 +20481,17 @@ qemuConnectGetAllDomainStats(virConnectPtr conn,
>>>>
>>>>          virObjectLock(vm);
>>>>
>>>> -        if (HAVE_JOB(privflags) &&
>>>> -            qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) == 0)
>>>> -            domflags |= QEMU_DOMAIN_STATS_HAVE_JOB;
>>>
>>> Existing, but if BeginJob fails for some "other" reason, then one
>>> wonders how much of the next steps actually happen.
>>
>> I'm not sure what you mean. Any StatsWorker callback (from
>> qemuDomainGetStats()) that needs to talk to the monitor checks whether
>> grabbing the job succeeded. If it didn't, the callback returns
>> immediately. You can see this live - just put sleep(60) right at the
>> beginning of qemuAgentGetTime(), then from one console run 'virsh
>> domtime $dom' and from another 'virsh domstats'.
>>
>>> Furthermore, we
>>> don't clear the LastError.
>>>
>>>> +        if (HAVE_JOB(privflags)) {
>>>> +            int rv;
>>>> +
>>>> +            if (flags & VIR_CONNECT_GET_ALL_DOMAINS_STATS_BEST_EFFORT)
>>>> +                rv = qemuDomainObjBeginJobInstant(driver, vm, QEMU_JOB_QUERY);
>>>
>>> Let's assume rv = -1 for a moment and it's the last domain that caused
>>> the failure - does that mean the caller then re...
>>
>> unfinished sen... :-)
>>
>
> Darn those interruptions - I think in the end this was all related to
> the clearing of the error message. I know I was concerned that for this
> avoid-wait condition we wouldn't message. For this case, it's handled
> by ignoring it, but perhaps some other future consumer would need to
> know it has to message that the call failed. It would then need to
> check if the last message was NULL, then generate a message.

Yup. But again, this is pre-existing and deserves a separate patch. I
can send it as a follow up once these are merged - also to motivate
others to merge these ;-)

>
> Of course then I got into the - hey wait, we don't clear the last error
> in the event that we're ignoring it and naturally neglected to go back
> to my first train of thought to complete my "tence" (;-)).

:-D

Michal

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list