On 11/17/21 12:12 PM, Daniel Henrique Barboza wrote:
On 11/17/21 07:51, Peter Krempa wrote:
On Tue, Nov 16, 2021 at 21:11:50 -0300, Daniel Henrique Barboza wrote:
Commit 65b0b746b516 changed spice tests to use latest caps. Before this
change, "FLAG_REAL_CAPS" wasn't being set in testQemuInfoInitArgs(). The
absence of this flag triggered the code path inside
testCompareXMLToArgv() that executed testUpdateQEMUCaps(). This function
will update the host CPU via virQEMUCapsUpdateHostCPUModel() into
virQEMUCapsInitHostCPUModel(). In this function,
virQEMUCapsInitCPUModel() would end up updating the hostCPU inside the
qemuCaps (via virQEMUCapsProbeHostCPU()). Before the forementioned
commit, the host CPU was being defaulted to x86_64, vendor Intel, for
the 'graphics-spice-timeout' test that is using the 'pc' machine type
and 'accel=kvm'.
Today, "FLAG_REAL_CAPS" is being set because we're using the latest caps
from x86_64. This means that the whole code path mentioned above is
skipped. qemuCaps are now being loaded via virQEMUCapsLoadCache()
directly. Without the handling being done by testUpdateQEMUCaps(), the
host CPU is being retrieved later on, down below
qemuProcessCreatePretendCmdPrepare() into qemuProcessUpdateGuestCPU().
The latter will attempt to update the domain cpu and executing a
virCPUCompare with the hostCPU and def->cpu.
All this logic ended up causing a failure of the
'graphics-spice-timeout' test in ppc64 and s390x hosts. This test is
being run with KVM acceleration, and the KVM driver for ppc64 and s390x
will return a default x86_64 CPU with vendor "AMD", making
virCPUCompare() fail with the following message:
With all of this description which sounds plausible and pointing to the
root cause of tests not being stable enough when used with real caps,
the fix you are proposing seems to really just fix the symptom.
This patch itself is acceptable as is, but doesn't really fix the real
issue. If you don't plan on addressing the root cause, please:
I'll wait to see if that fixes the problem Boris was experiencing with
s390x. Meanwhile I'll dig further to see if I can detect the root cause
of the problem (e.g. can we default to an Intel CPU instead of AMD).
1) note in the commit message for this commit that it's not really
fixing the root cause, just the symptom
2) file an upstream issue with your analysis so that we don't forget
that this is still broken
If I fail to fix the root cause in a sensible time I'll end up pushing this
patch - with the adjustments you suggested - to not let the tests being
broken for too long.
It is already broken since 11/4 for all none-x86 architectures and I
think it should be fixed ASAP even if it is a patch fixing the symptoms
only.
Btw why is the sanity check of the CI only run on x86?
I'll file the bug shortly.
Thanks,
Daniel
"QEMU XML-2-ARGV graphics-spice-timeout.x86_64-latest ... libvirt: CPU
Driver error : the CPU is incompatible with host CPU: host CPU vendor
does
not match required CPU vendor Intel"
Fix this test by setting cpu check='none' and avoid the virCPUCompare()
that causes the problem for ppc64 and s390x hosts.
To resolve point 1), be more explicit that it's just a symptom fix in
this sentence.
Reported-by: Boris Fiuczynski <fiuczy@xxxxxxxxxxxxx>
Signed-off-by: Daniel Henrique Barboza <danielhb413@xxxxxxxxx>
---
Sending as a RFC because I'm not sure if this patch fixes the
problem for s390x. Boris, can you please test and see if this
fix works for you?
It is plausible it will work, there are other cases which use
match='exact' check='none' using real caps on X86 and they don't seem to
be causing problems.
--
Mit freundlichen Grüßen/Kind regards
Boris Fiuczynski
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294