ARMv6 boot problem - guest stuck in genl_init at __do_softirq

cdall at cs.columbia.edu (Christoffer Dall) · Tue, 26 Oct 2010 17:10:17 +0200

[snip]

> The thing that helped me the most in those kinds of situations was
> tracepoints/printks. I would certainly recommend tracepoints these days ;).
>
> If you add a tracepoint on every guest exit, you get a pretty clear picture
> of where the guest is executing code at random points in time. That helps a
> lot in following issues exactly like the one you're having. If it's in a loop,
> you'll certainly see that just by looking at the trace.
>
Tracepoints are indeed a useful tool. Thanks. Interestingly, the problem
went away when printing every guest exit, so I figured it had to do with
timing. Instead I implemented a ring buffer to log the tracepoint entries
and made a /proc/kvm file available to retrieve the logs.
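
For illustration, here is a minimal userspace sketch of that kind of ring
buffer. It is not the actual patch: the struct layout, the field names and
the trace_log()/trace_dump() helpers are all hypothetical, and the /proc
plumbing is left out. The point is only that logging an exit is a couple of
stores, so it disturbs guest timing far less than a printk does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_ENTRIES 8 /* power of two, so the index can be masked */

/* One logged guest exit (hypothetical fields). */
struct trace_entry {
	uint32_t guest_pc;
	uint32_t exit_reason;
};

struct trace_ring {
	struct trace_entry entries[TRACE_ENTRIES];
	unsigned int head; /* total number of entries logged so far */
};

/* Record one exit, overwriting the oldest entry once the ring is full. */
static void trace_log(struct trace_ring *r, uint32_t pc, uint32_t reason)
{
	struct trace_entry *e = &r->entries[r->head & (TRACE_ENTRIES - 1)];

	e->guest_pc = pc;
	e->exit_reason = reason;
	r->head++;
}

/*
 * Copy up to 'max' of the most recent entries into 'out', oldest first.
 * Returns the number of entries copied. Counter wrap-around of 'head'
 * is ignored in this sketch.
 */
static size_t trace_dump(const struct trace_ring *r,
			 struct trace_entry *out, size_t max)
{
	size_t avail, n, i;
	unsigned int start;

	assert(r != NULL && out != NULL);

	avail = r->head < TRACE_ENTRIES ? r->head : TRACE_ENTRIES;
	n = avail < max ? avail : max;
	start = r->head - (unsigned int)n;

	for (i = 0; i < n; i++)
		out[i] = r->entries[(start + i) & (TRACE_ENTRIES - 1)];
	return n;
}
```

In a sketch like this, the read handler behind the /proc file would call
something like trace_dump() and format the entries as text.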

Finally, I ended up debugging the rcu_synchronize() call in the guest, which,
as it turns out, is where it was halting. The problem was a simple, stupid bug
on my end: my assembly world-switch code wrote a full word to a 'u8' field
in a C struct, thereby overwriting an interrupt-pending flag. As a result,
the guest received too few interrupts for the softirq callback to wake up the
thread that called rcu_synchronize().
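
The class of bug can be reproduced in plain C. The sketch below is an
illustration only: the struct layout and all names are made up, and memcpy()
stands in for the word-sized store (str) that the assembly used where a
byte-sized store (strb) was needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a u8 field followed directly by the pending flag. */
struct vcpu_state {
	uint8_t mode_flag;   /* the field the assembly meant to update */
	uint8_t irq_pending; /* neighbour that gets clobbered */
	uint8_t pad[2];
};

/* The 4-byte store below must stay inside the struct. */
_Static_assert(offsetof(struct vcpu_state, mode_flag) + sizeof(uint32_t)
	       <= sizeof(struct vcpu_state), "word store must fit");

/* Buggy variant: writes a full 32-bit word at the address of the u8 field. */
static void store_word(struct vcpu_state *s, uint32_t val)
{
	assert(s != NULL);
	memcpy((unsigned char *)s + offsetof(struct vcpu_state, mode_flag),
	       &val, sizeof(val));
}

/* Fixed variant: writes only the low byte, leaving the neighbours alone. */
static void store_byte(struct vcpu_state *s, uint32_t val)
{
	assert(s != NULL);
	s->mode_flag = (uint8_t)val;
}
```

With irq_pending set to 1, store_word(&s, 1) leaves irq_pending at 0 on both
little- and big-endian machines, while store_byte(&s, 1) leaves it at 1.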

Thanks for all the help once again.