On 06/07/2013 06:14 AM, Martin Kletzander wrote:
> On 05/30/2013 02:24 PM, John Ferlan wrote:
>> Since commit '632f78ca' the 'virsh schedinfo <domain>' command returns:
>>
>>   Scheduler      : Unknown
>>   error: Requested operation is not valid: cgroup CPU controller is not mounted
>>
>> Prior to that change a non running domain would return:
>>
>>   Scheduler      : posix
>>   cpu_shares     : 0
>>   vcpu_period    : 0
>>   vcpu_quota     : 0
>>   emulator_period: 0
>>   emulator_quota : 0
>>
>> This change will result in the following:
>>
>>   Scheduler      : posix
>>   cpu_shares     : 0
>>
>> The code sequence is a 'virGetSchedulerType()' which returns the "*params"
>> for a subsequent 'virDomainGetSchedulerParameters[Flags]()' call. The latter
>> requires at least 1 'nparam' to be provided/returned.
>
> I can't find where the nparams is required to be >= 1, but that might
> changed since you've posted this patch. I changed the '*nparams = 1' to
> '0' and it works perfectly (returns only the scheduler type, so it's
> visible that nothing is set).
>

The documentation for virDomainGetSchedulerParameters[Flags] has:

 *
 * Get all scheduler parameters. On input, @nparams gives the size of the
 * @params array; on output, @nparams gives how many slots were filled
 * with parameter information, which might be less but will not exceed
 * the input value. @nparams cannot be 0.
 *

Also, virDomainGetSchedulerParameters checks that nparams > 0:

    virCheckPositiveArgGoto(*nparams, error);

The reason why 'virsh schedinfo' printed just the scheduler type is that
virsh-domain.c checks 'nparams' before calling the
virDomainGetSchedulerParametersFlags() API.

>> ---
>>  src/qemu/qemu_driver.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index 4a76f14..7292149 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -6899,12 +6899,20 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom,
>>      }
>>      priv = vm->privateData;
>>
>> +    /* Domain not running or no cgroup */
>> +    if (!priv->cgroup) {
>> +        if (nparams)
>> +            *nparams = 1;
>> +        goto cleanup;
>> +    }
>> +
>>      if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
>>          virReportError(VIR_ERR_OPERATION_INVALID,
>>                         "%s", _("cgroup CPU controller is not mounted"));
>>          goto cleanup;
>>      }
>>
>> +
>>      if (nparams) {
>>          rc = qemuGetCpuBWStatus(priv->cgroup);
>>          if (rc < 0)
>> @@ -6915,11 +6923,12 @@ static char *qemuDomainGetSchedulerType(virDomainPtr dom,
>>          *nparams = 5;
>>      }
>>
>> +cleanup:
>>      ignore_value(VIR_STRDUP(ret, "posix"));
>>
>
> You can get to here even when the domain doesn't exist and in that case
> NULL should be returned to indicate an error. It would also skip an
> error on problem in qemuGetCpuBWStatus(), but there is none printed, so
> at least we would return the scheduler type.
>

Oh yeah, right, duh. I was trying to cover the "!priv->cgroup" case
without copying that same VIR_STRDUP() line.

Setting '*nparams = 1' was taken from the viewpoint of the return 0 case
from the [lxc|qemu]GetCpuBWStatus() calls (since cgroups aren't started).
For a non running guest, I assumed that none of the 5 values really meant
anything, but I can see the logic that would get/set that count once
prior to vm startup and then repeatedly assume/use that value to call
virDomainGetSchedulerParametersFlags() during the lifetime of whatever is
monitoring the vm, so I understand why setting it to 5 is valid/desired.
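To spell out the consuming pattern, this is roughly what such a
monitoring app would do against the public API (an untested sketch with
error reporting elided, just to show why a stable, nonzero *nparams
matters):

    #include <stdio.h>
    #include <stdlib.h>
    #include <libvirt/libvirt.h>

    /* Fetch the parameter count once via virDomainGetSchedulerType(),
     * then reuse it to size the params array for the later
     * virDomainGetSchedulerParametersFlags() call. */
    static void
    showSchedInfo(virDomainPtr dom)
    {
        char *type = NULL;
        virTypedParameterPtr params = NULL;
        int nparams = 0;
        int i;

        if (!(type = virDomainGetSchedulerType(dom, &nparams)) ||
            nparams <= 0)
            goto cleanup;

        if (!(params = calloc(nparams, sizeof(*params))))
            goto cleanup;

        if (virDomainGetSchedulerParametersFlags(dom, params, &nparams,
                                                 VIR_DOMAIN_AFFECT_CONFIG) < 0)
            goto cleanup;

        printf("Scheduler      : %s\n", type);
        for (i = 0; i < nparams; i++) {
            /* Scheduler parameters are all [unsigned] long long */
            if (params[i].type == VIR_TYPED_PARAM_ULLONG)
                printf("%-15s: %llu\n", params[i].field, params[i].value.ul);
            else if (params[i].type == VIR_TYPED_PARAM_LLONG)
                printf("%-15s: %lld\n", params[i].field, params[i].value.l);
        }

     cleanup:
        free(params);
        free(type);
    }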
Once the guest is running, if there was an issue with the
[lxc|qemu]GetCpuBWStatus() call, then only 'cpu_shares' would get
returned due to checks in the GetSchedulerParametersFlags() APIs.

Thus, if I'm reading the responses thus far correctly, then when cgroups
aren't enabled for the domain (e.g., it's not active):

    qemuDomainGetSchedulerType() would return "posix" and 5 for *nparams
    lxcDomainGetSchedulerType() would return "posix" and 3 for *nparams

e.g., for qemu:

    /* Domain not running or no cgroup */
    if (!priv->cgroup) {
        if (nparams)
            *nparams = 5;
        ignore_value(VIR_STRDUP(ret, "posix"));
        goto cleanup;
    }

(moving the 'ignore_value(VIR_STRDUP(ret, "posix"));' line back to
before the 'cleanup:' label).

For the [lxc|qemu]DomainGetSchedulerType() APIs, the only time
[lxc|qemu]GetCpuBWStatus() would be called is if the domain is active,
i.e., we get past the virCgroupPathOfController() call.

The [qemu|lxc]DomainGetSchedulerParametersFlags() APIs need adjustments
as well, because currently they make the same call to
[lxc|qemu]GetCpuBWStatus() regardless of whether the domain is active
and regardless of which 'flags' are set (current, live, config). For
this code I was less sure how to handle what gets returned. Each has a
call such as (from qemu):

    if (*nparams > 1) {
        rc = qemuGetCpuBWStatus(priv->cgroup);
        if (rc < 0)
            goto cleanup;
        cpu_bw_status = !!rc;
    }

which sets 'cpu_bw_status' based on rc, which is documented as follows:

 * Return 1 when CFS bandwidth is supported, 0 when CFS bandwidth is not
 * supported, -1 on error.

For the non active case, this returns 0 and thus clears the
'cpu_bw_status' flag, which affects the code deciding how many
configuration parameters to return:

    if (flags & VIR_DOMAIN_AFFECT_CONFIG) {
        shares = persistentDef->cputune.shares;
        if (*nparams > 1 && cpu_bw_status) {
            ...

Again, given how I'm reading the responses, the check here for
'cpu_bw_status' could then be removed, since if the vm is not active or
the "--config" flag is used, we return all 5 parameters regardless of
whether the vm is running and regardless of whether the cgroup is found.

Another option would be to add a check just prior that sets
'cpu_bw_status = true' when we don't have a cgroup and we're just
getting config data (a fuller sketch is in the P.S. below), such as:

    if (!priv->cgroup && (flags & VIR_DOMAIN_AFFECT_CONFIG))
        cpu_bw_status = true;

One way or another, 'cpu_bw_status' needs to be set to true before the
"goto out" so that all the data points are copied into the output
"params[]" array. For "current" or "live" data of a running vm with
cgroups mounted, the subsequent checks will return an error, or 1, 3, or
5 elements, depending on the success or failure of the live checks.

Obviously, I'll post a v2, but I would prefer to get it "more right"
without too much more "churn".

Tks,

John

>> -cleanup:
>>      if (vm)
>>          virObjectUnlock(vm);
>> +
>>      return ret;
>>  }
>>
>>
>
> This patch fixes the problem, but may create a new one. Also 'virsh
> schedinfo --config <domain>' is different for running and shutoff
> domain, but I'm willing to overlook that since it is less problematic
> than what happened before.
>
> Martin
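P.S. To make that second option concrete, here's a rough, untested
sketch (not the v2 itself, just the idea) of how the qemu status check
could fold in the config-only case, reusing the names from the existing
code:

    /* Sketch only: a config query of an inactive domain never touches
     * the cgroup, so force cpu_bw_status on in that case so that all 5
     * parameters get copied into the output params[] array. */
    if (*nparams > 1) {
        if (priv->cgroup) {
            rc = qemuGetCpuBWStatus(priv->cgroup);
            if (rc < 0)
                goto cleanup;
            cpu_bw_status = !!rc;
        } else if (flags & VIR_DOMAIN_AFFECT_CONFIG) {
            cpu_bw_status = true;
        }
    }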