Oliver and/or Marc, question for y'all towards the bottom. On Wed, Feb 28, 2024, Sean Christopherson wrote: > On Wed, Feb 28, 2024, Mark Brown wrote: > > On Thu, Feb 08, 2024 at 09:48:39PM +0100, Thomas Huth wrote: > > > From: Sean Christopherson <seanjc@xxxxxxxxxx> > > > > > > Extract the code to set a vCPU's entry point out of vm_arch_vcpu_add() and > > > into a new API, vcpu_arch_set_entry_point(). Providing a separate API > > > will allow creating a KVM selftests hardness that can handle tests that > > > use different entry points for sub-tests, whereas *requiring* the entry > > > point to be specified at vCPU creation makes it difficult to create a > > > generic harness, e.g. the boilerplate setup/teardown can't easily create > > > and destroy the VM and vCPUs. > > > > With today's -next I'm seeing most of the KVM selftests failing on an > > arm64 defconfig with: > > > > # ==== Test Assertion Failure ==== > > # include/kvm_util_base.h:677: !ret > > # pid=735 tid=735 errno=9 - Bad file descriptor > > # 1 0x0000000000410937: vcpu_set_reg at kvm_util_base.h:677 (discriminator 4) > > # 2 (inlined by) vcpu_arch_set_entry_point at processor.c:370 (discriminator 4) > > # 3 0x0000000000407bab: vm_vcpu_add at kvm_util_base.h:981 > > # 4 (inlined by) __vm_create_with_vcpus at kvm_util.c:419 > > # 5 (inlined by) __vm_create_shape_with_one_vcpu at kvm_util.c:432 > > # 6 0x000000000040187b: __vm_create_with_one_vcpu at kvm_util_base.h:892 > > # 7 (inlined by) vm_create_with_one_vcpu at kvm_util_base.h:899 > > # 8 (inlined by) main at aarch32_id_regs.c:158 > > # 9 0x0000007fbcbe6dc3: ?? ??:0 > > # 10 0x0000007fbcbe6e97: ?? ??:0 > > # 11 0x0000000000401f2f: _start at ??:? > > # KVM_SET_ONE_REG failed, rc: -1 errno: 9 (Bad file descriptor) > > > > and a bisect pointed to this commit which does look plausibly relevant. > > > > Note that while this was bisected with plain arm64 defconfig and the KVM > > selftests fragment was not enabled, but enabling the KVM fragment gave > > the same result as would be expected based on the options enabled by the > > fragment. We're also seeing an alternative failure pattern where the > > tests segfault when run in a different environment, I'm also tracking > > that down but I suspect these are the same issue. > > Gah, my bad, I should have at least tested on ARM since I have easy access to > such hardware. If I can't figure out what's going wrong in the next few hours, > I'll drop this series and we can try again for 6.10. > > Sorry :-/ /facepalm The inner helper doesn't return the vCPU, and by dumb (bad) luck, selftests end up trying to use fd=0. diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c index ed4ab29f4fad..a9eb17295be4 100644 --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c @@ -386,6 +386,7 @@ static struct kvm_vcpu *__aarch64_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id, aarch64_vcpu_setup(vcpu, init); vcpu_set_reg(vcpu, ARM64_CORE_REG(sp_el1), stack_vaddr + stack_size); + return vcpu; } struct kvm_vcpu *aarch64_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id, I'll squash the above and force push. In my defense, I would have caught this when build-testing, as the compiler does warn... lib/aarch64/processor.c -o /usr/local/google/home/seanjc/go/src/kernel.org/nox/tools/testing/selftests/kvm/lib/aarch64/processor.o lib/aarch64/processor.c: In function ‘__aarch64_vcpu_add’: lib/aarch64/processor.c:389:1: warning: no return statement in function returning non-void [-Wreturn-type] 389 | } | ^ At top level: cc1: note: unrecognized command-line option ‘-Wno-gnu-variable-sized-type-not-at-end’ may have been intended to silence earlier diagnostics but due to a different issue that is fixed in the kvm-arm tree[*], but not in mine, I built without -Werror and didn't see the new warn in the sea of GUEST_PRINTF warnings. Ugh, and I still can't enable -Werror, because there are unused functions in aarch64/vpmu_counter_access.c aarch64/vpmu_counter_access.c:96:20: error: unused function 'enable_counter' [-Werror,-Wunused-function] static inline void enable_counter(int idx) ^ aarch64/vpmu_counter_access.c:104:20: error: unused function 'disable_counter' [-Werror,-Wunused-function] static inline void disable_counter(int idx) ^ 2 errors generated. make: *** [Makefile:278: /usr/local/google/home/seanjc/go/src/kernel.org/nox/tools/testing/selftests/kvm/aarch64/vpmu_counter_access.o] Error 1 make: *** Waiting for unfinished jobs.... Commit 49f31cff9c533d264659356b90445023b04e10fb failed to build with 'make-clang make-arm make -j128'. Oliver/Marc, any thoughts on how you want to fix the unused function warnings? As evidenced by this goof, being able to compile with -Werror is super helpful. And another question: is there any reason to not force -Werror for selftests? [*] https://lore.kernel.org/all/20240202234603.366925-1-seanjc@xxxxxxxxxx