On Tue, Jun 8, 2021 at 8:51 AM Michal Prívozník <mprivozn@xxxxxxxxxx> wrote:
>
> On 6/7/21 6:22 PM, Fabiano Fidêncio wrote:
> > Currently `virt-host-validate` will fail whenever one of its calls fail,
> > regardless of virHostValidateLevel set.
> >
> > This behaviour is not optimal and makes it not exactly reliable as a
> > command line tool as other tools or scripts using it would have to check
> > its output to figure out whether something really failed or if a warning
> > was mistakenly treated as failure.
> >
> > With this change, the behaviour of whether to fail or not, is defined by
> > the caller of those functions, based on the virHostValidateLevel passed
> > to them.
> >
> > https://gitlab.com/libvirt/libvirt/-/issues/175
> >
> > Signed-off-by: Fabiano Fidêncio <fabiano@xxxxxxxxxxxx>
> > ---
> > Changes since v1:
> > * Replace the `goto out;` and the `out` labels by the
> >   `VIR_HOST_VALIDATE_FAILURE` macro
>
> Yeah, this opened pandora's box...

Oh yeah, it looked pretty much like that when I first looked at this
code over the weekend.
>
> > ---
> >  tools/virt-host-validate-common.c | 30 +++++++++++++++---------------
> >  tools/virt-host-validate-common.h | 14 ++++++++++++++
> >  2 files changed, 29 insertions(+), 15 deletions(-)
> >
> > diff --git a/tools/virt-host-validate-common.c b/tools/virt-host-validate-common.c
> > index 6dd851f07d..9412bb7514 100644
> > --- a/tools/virt-host-validate-common.c
> > +++ b/tools/virt-host-validate-common.c
> > @@ -142,7 +142,7 @@ int virHostValidateDeviceExists(const char *hvname,
> >
> >      if (access(dev_name, F_OK) < 0) {
> >          virHostMsgFail(level, "%s", hint);
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >
> >      virHostMsgPass();
> > @@ -159,7 +159,7 @@ int virHostValidateDeviceAccessible(const char *hvname,
> >
> >      if (access(dev_name, R_OK|W_OK) < 0) {
> >          virHostMsgFail(level, "%s", hint);
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >
> >      virHostMsgPass();
> > @@ -180,7 +180,7 @@ int virHostValidateNamespace(const char *hvname,
> >
> >      if (access(nspath, F_OK) < 0) {
> >          virHostMsgFail(level, "%s", hint);
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >
> >      virHostMsgPass();
> > @@ -264,17 +264,17 @@ int virHostValidateLinuxKernel(const char *hvname,
> >
> >      if (STRNEQ(uts.sysname, "Linux")) {
> >          virHostMsgFail(level, "%s", hint);
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >
> >      if (virParseVersionString(uts.release, &thisversion, true) < 0) {
> >          virHostMsgFail(level, "%s", hint);
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >
> >      if (thisversion < version) {
> >          virHostMsgFail(level, "%s", hint);
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
>
> Up until here the use of VIR_HOST_VALIDATE_FAILURE() is good.
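
The header hunk that adds VIR_HOST_VALIDATE_FAILURE is not quoted in this
reply, so take the following as a rough sketch only. The idea is that a
failed check returns -1 only when the caller asked for the hard-failure
level, and 0 for the softer levels. The enum values and the macro body
below are assumptions for illustration, and exampleCheck() is a
hypothetical stand-in for the real check functions, not code from the
patch.

```c
/* Assumed to resemble the severity levels declared in
 * tools/virt-host-validate-common.h; the header hunk is not quoted
 * in this thread, so the names and order are an assumption. */
typedef enum {
    VIR_HOST_VALIDATE_FAIL,
    VIR_HOST_VALIDATE_WARN,
    VIR_HOST_VALIDATE_NOTE,
} virHostValidateLevel;

/* Hypothetical macro body: only a FAIL-level check turns into -1.
 * The outer parentheses keep the ternary safe inside larger
 * expressions such as "ret = VIR_HOST_VALIDATE_FAILURE(level);". */
#define VIR_HOST_VALIDATE_FAILURE(level) \
    (((level) == VIR_HOST_VALIDATE_FAIL) ? -1 : 0)

/* Hypothetical stand-in for a check such as
 * virHostValidateDeviceExists(): "ok" replaces the real probe. */
static int
exampleCheck(int ok, virHostValidateLevel level)
{
    if (!ok) {
        /* The real code reports the problem first, e.g. with
         * virHostMsgFail(level, "%s", hint), then returns. */
        return VIR_HOST_VALIDATE_FAILURE(level);
    }

    return 0;
}
```

With this shape, exampleCheck(0, VIR_HOST_VALIDATE_WARN) still reports
the problem but returns 0, so a script that only looks at the exit
status is not tripped up by a warning.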
>
> >      } else {
> >          virHostMsgPass();
> >          return 0;
> > @@ -291,7 +291,7 @@ int virHostValidateCGroupControllers(const char *hvname,
> >      size_t i;
> >
> >      if (virCgroupNew("/", -1, &group) < 0)
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
>
> But this looks somewhat suspicious. What virCgroupNew() does is it
> detects controllers and their mountpoints (and return this in @group).
> Upon error (e.g. unable to open /proc/mounts or /proc/self/cgroups or
> failure in their parsing) a libvirt error is reported. And this reminds
> me of pandora's box. Firstly, the error object is never initialized
> (i.e. there's no virInitialize() call from main(), or better said before
> the first function that has potential of reporting an error). This leads
> to virResetError() (called from virReportError()) to free() random
> pointers.
>
> Secondly, I'm unsure what to do in this case. I mean, caller told us
> severity of this check (@level), so if we failed to initialize an
> internal structure we should honour caller's wish and do/do not return
> an error. BUT, with the way this code is currently written this function
> exits early, not printing anything anywhere. E.g.:
>
>   QEMU: Checking if device /dev/net/tun exists             : PASS
>   QEMU: Checking for device assignment IOMMU support       : PASS
>
> Whereas this is from a regular run:
>
>   QEMU: Checking if device /dev/net/tun exists             : PASS
>   QEMU: Checking for cgroup 'cpu' controller support       : PASS
>   QEMU: Checking for cgroup 'cpuacct' controller support   : PASS
>   QEMU: Checking for cgroup 'cpuset' controller support    : PASS
>   QEMU: Checking for cgroup 'memory' controller support    : PASS
>   QEMU: Checking for cgroup 'devices' controller support   : PASS
>   QEMU: Checking for cgroup 'blkio' controller support     : PASS
>   QEMU: Checking for device assignment IOMMU support       : PASS
>
> This becomes more obvious [1]
>
> >
> >      for (i = 0; i < VIR_CGROUP_CONTROLLER_LAST; i++) {
> >          int flag = 1 << i;
> > @@ -303,7 +303,7 @@ int virHostValidateCGroupControllers(const char *hvname,
> >          virHostMsgCheck(hvname, "for cgroup '%s' controller support", cg_name);
> >
> >          if (!virCgroupHasController(group, i)) {
> > -            ret = -1;
> > +            ret = VIR_HOST_VALIDATE_FAILURE(level);
> >              virHostMsgFail(level, "Enable '%s' in kernel Kconfig file or "
> >                             "mount/enable cgroup controller in your system",
> >                             cg_name);
> > @@ -320,7 +320,7 @@ int virHostValidateCGroupControllers(const char *hvname G_GNUC_UNUSED,
> >                                       virHostValidateLevel level)
> >  {
> >      virHostMsgFail(level, "%s", "This platform does not support cgroups");
> > -    return -1;
> > +    return VIR_HOST_VALIDATE_FAILURE(level);
> >  }
> >  #endif /* !__linux__ */
> >
> > @@ -354,7 +354,7 @@ int virHostValidateIOMMU(const char *hvname,
> >                             "No ACPI DMAR table found, IOMMU either "
> >                             "disabled in BIOS or not supported by this "
> >                             "hardware platform");
> > -            return -1;
> > +            return VIR_HOST_VALIDATE_FAILURE(level);
> >          }
> >      } else if (isAMD) {
> >          virHostMsgCheck(hvname, "%s", _("for device assignment IOMMU support"));
> > @@ -366,7 +366,7 @@ int virHostValidateIOMMU(const char *hvname,
> >                             "No ACPI IVRS table found, IOMMU either "
> >                             "disabled in BIOS or not supported by this "
> >                             "hardware platform");
> > -            return -1;
> > +            return VIR_HOST_VALIDATE_FAILURE(level);
> >          }
> >      } else if (ARCH_IS_PPC64(arch)) {
> >          /* Empty Block */
> > @@ -385,7 +385,7 @@ int virHostValidateIOMMU(const char *hvname,
> >      } else {
> >          virHostMsgFail(level,
> >                         "Unknown if this platform has IOMMU support");
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >
> >
> > @@ -404,7 +404,7 @@ int virHostValidateIOMMU(const char *hvname,
> >                             "Add %s to kernel cmdline arguments", bootarg);
> >          else
> >              virHostMsgFail(level, "IOMMU capability not compiled into kernel.");
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
> >      }
> >      virHostMsgPass();
> >      return 0;
> > @@ -468,7 +468,7 @@ int virHostValidateSecureGuests(const char *hvname,
> >      }
> >
> >      if (virFileReadValueString(&cmdline, "/proc/cmdline") < 0)
> > -        return -1;
> > +        return VIR_HOST_VALIDATE_FAILURE(level);
>
> 1: here. IIUC, at the beginning of this function (not included in this
> context) a message is printed out:
>
>   virHostMsgCheck(hvname, "%s", _("for secure guest support"));
>
> but this 'return' makes it exit early (even without terminating the
> line, corrupting the output).

Mind that this is something that's been exposed, rather than introduced,
by this patch.
```
[fidencio@dentola ~]$ virt-host-validate qemu
  QEMU: comprobando if device /dev/kvm exists                    : PASA
  QEMU: comprobando if device /dev/kvm is accessible             : PASA
  QEMU: comprobando if device /dev/vhost-net exists              : PASA
  QEMU: comprobando if device /dev/net/tun exists                : PASA
  QEMU: comprobando for cgroup 'cpu' controller support          : PASA
  QEMU: comprobando for cgroup 'cpuacct' controller support      : PASA
  QEMU: comprobando for cgroup 'cpuset' controller support       : PASA
  QEMU: comprobando for cgroup 'memory' controller support       : PASA
  QEMU: comprobando for cgroup 'devices' controller support      : ADVERTENCIA (Enable 'devices' in kernel Kconfig file or mount/enable cgroup controller in your system)
  QEMU: comprobando for cgroup 'blkio' controller support        : PASA
ADVERTENCIA (Unknown if this platform has IOMMU support) -----> THIS IS ALREADY WRONG!
  QEMU: comprobando for secure guest support                     : ADVERTENCIA (Unknown if this platform has Secure Guest support)
```

So far we've been lucky that the majority of users never hit these, but
the problems already exist.

Let me send a v3, just for the sake of not adding more bugs to this
bucket: we'll only respect the caller's choice when verifying the
support, but not when we fail to initialize an internal structure. This
will leave those paths 1:1 with what they were before, and the clean-ups
can come later.

What do you think?

> So, long story short, I'm inclined to merge your patch and will post
> some cleanups shortly.

See my suggestion above. I don't have many free cycles on my hands to
work on those cleanups, but I'd be super happy to review & test them in
case you have the time.

Best Regards,
--
Fabiano Fidêncio