[snip]
> The thing that helped me the most in those kinds of situations was
> tracepoints/printks. I would certainly recommend tracepoints these days ;).
>
> If you add a tracepoint on every guest exit, you get a pretty clear picture
> on where the guest is processing code on random points in time. That helps a
> lot following issues exactly like the one you're having. If it's in a loop,
> you'll certainly see that just by looking at the trace.
>

Tracepoints are indeed a useful utility. Thanks.

Interestingly, the problem went away when printing on every guest exit, so I
figured it had to do with timing. Instead, I implemented a ring buffer to log
the tracepoint entries and made a /proc/kvm file available to retrieve the
logs.

Finally, I ended up debugging the rcu_synchronize() call in the guest, which,
as it turns out, is where it was halting. The problem was a simple, stupid bug
on my end: my assembly world-switch code wrote a full word to a 'u8' field in
a C struct, thereby overwriting an interrupt-pending flag. As a result, too few
interrupts were delivered for the softirq callback to wake up the thread that
called rcu_synchronize().

Thanks for all the help once again.