On 27 November 2015 at 00:54, Christian Borntraeger
<borntraeger@xxxxxxxxxx> wrote:
> On 11/26/2015 09:47 PM, Christian Borntraeger wrote:
>> On 11/26/2015 05:17 PM, Tyler Baker wrote:
>>> Hi Christian,
>>>
>>> The kernelci.org bot recently has been reporting kvm guest boot
>>> failures[1] on various arm64 platforms in next-20151126. The bot
>>> bisected[2] the failures to the commit in -next titled "KVM: Create
>>> debugfs dir and stat files for each VM". I confirmed by reverting this
>>> commit on top of next-20151126 it resolves the boot issue.
>>>
>>> In this test case the host and guest are booted with the same kernel.
>>> The host is booted over nfs, installs qemu (qemu-system arm64 2.4.0),
>>> and launches a guest. The host is booting fine, but when the guest is
>>> launched it errors with "Failed to retrieve host CPU features!". I
>>> checked the host logs, and found an "Unable to handle kernel paging
>>> request" splat[3] which occurs when the guest is attempting to start.
>>>
>>> I scanned the patch in question but nothing obvious jumped out at me,
>>> any thoughts?
>>
>> Not really.
>> Do you have processes running that read the files in
>> /sys/kernel/debug/kvm/* ?
>>
>> If I read the arm oops message correctly it oopsed inside
>> __srcu_read_lock. There is actually nothing in there that can oops,
>> except the access to the preempt count. I am just guessing right now,
>> but maybe the preempt variable is no longer available (as the process
>> is gone). As long as a debugfs file is open, we hold a reference to
>> the kvm, which holds a reference to the mm, so the mm might be killed
>> after the process. But this is supposed to work, so maybe it's something
>> different. An objdump of __srcu_read_lock might help.
>
> Hmm, the preempt thing is done in srcu_read_lock, but the crash is in
> __srcu_read_lock. This function gets the srcu struct from mmu_notifier.c,
> which must be present and is initialized during boot.
>
> int __srcu_read_lock(struct srcu_struct *sp)
> {
>         int idx;
>
>         idx = READ_ONCE(sp->completed) & 0x1;
>         __this_cpu_inc(sp->per_cpu_ref->c[idx]);
>         smp_mb(); /* B */  /* Avoid leaking the critical section. */
>         __this_cpu_inc(sp->per_cpu_ref->seq[idx]);
>         return idx;
> }
>
> Looking at the code I have no clue why the patch makes a difference.
> Can you try to get an objdump -S for __srcu_read_lock?

Using next-20151126 as the base, here is the objdump[1] I came up with
for __srcu_read_lock.

Cheers,

Tyler

[1] http://hastebin.com/bifiqobola.pl
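
For anyone following the __srcu_read_lock discussion above, here is a
rough single-threaded user-space model of the quoted function. It is only
a sketch: the model_* names are made up for illustration and are not
kernel API, the per-CPU arrays are collapsed into one plain struct (so the
per-cpu offset that __this_cpu_inc adds in the kernel is left out), and
the memory barrier is dropped. What it tries to show is that the function
reaches memory only through sp and sp->per_cpu_ref, so a paging fault
there would be consistent with one of those two pointers being bad. That
is a guess, not something established in this thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's per-CPU SRCU counters. */
struct model_srcu_array {
	unsigned long c[2];   /* readers currently inside, per index */
	unsigned long seq[2]; /* read-side entries seen, per index */
};

/* Hypothetical stand-in for struct srcu_struct (only the fields used). */
struct model_srcu {
	unsigned long completed;       /* low bit selects the active index */
	struct model_srcu_array *ref;  /* models sp->per_cpu_ref */
};

/*
 * Same shape as the quoted __srcu_read_lock(): read sp->completed,
 * then bump two counters reached through sp->ref.
 */
static int model_read_lock(struct model_srcu *sp)
{
	int idx = (int)(sp->completed & 0x1);

	sp->ref->c[idx]++;
	/* The kernel issues smp_mb() here; omitted in this model. */
	sp->ref->seq[idx]++;
	return idx;
}

/* Counterpart that drops the reader count taken by model_read_lock(). */
static void model_read_unlock(struct model_srcu *sp, int idx)
{
	assert(sp->ref->c[idx] > 0);
	sp->ref->c[idx]--;
}
```

A caller would keep the index returned by model_read_lock() and pass it
back to model_read_unlock(), the same way srcu_read_lock() and
srcu_read_unlock() are paired in the kernel.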