ARMv6 boot problem - guest stuck in genl_init at __do_softirq

Hi all.

Finally I am back on track getting KVM/ARM running on devices and newer
architectures.

I am running into an issue, which seems to require some knowledge of the
guest kernel, and was wondering if anyone on the list can provide quick
debugging hints. The kernel version is 2.6.27 with ARM/Qualcomm patches for
the MSM processors.

The issue
--------------
During guest boot, when the initcalls are executed, the guest never makes it
through genl_init() in net/netlink/genetlink.c. This happens on every
single boot.

Symptoms
---------------
The guest simply doesn't proceed beyond that point. No crashes of the guest
kernel or host kernel.

Analysis so far
--------------------
It seems the problem happens when the guest takes a timer IRQ during
genl_register_mc_group(...). What I think happens is that IRQs are
continuously re-enabled before processing of the first one completes,
thereby causing an infinite loop of timer IRQs.

The IRQs are re-enabled in __do_softirq(), in the course of irq_exit(). This
feels weird to me. Should this really happen before exiting the IRQ handler
completely? The stack trace from just before IRQs are re-enabled is pasted
below.
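
To make the ordering in question concrete, here is a small user-space toy model of the irq_exit() -> __do_softirq() path. It is only a sketch under assumptions: struct cpu_model, model_irq(), model_do_softirq() and run_model() are hypothetical names, not kernel code, and the restart bound of 10 is meant to mirror MAX_SOFTIRQ_RESTART in kernel/softirq.c, which is worth checking against the tree in use. The model assumes that softirq handlers run with hardware IRQs enabled, that a nested IRQ does not re-enter softirq processing, and that leftover work is handed to a thread once the restart bound is hit.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SOFTIRQ_RESTART 10  /* assumed to match kernel/softirq.c */

/* Hypothetical toy state, not a kernel structure. */
struct cpu_model {
    bool irqs_enabled;
    int  hardirq_depth;
    int  softirq_depth;
    bool softirq_pending;
    int  softirq_runs;        /* how many softirq passes were executed */
    int  ticks_to_inject;     /* timer IRQs that arrive during softirqs */
    bool deferred_to_thread;  /* leftover work handed to a thread */
};

static void model_irq(struct cpu_model *cpu);

/* Models __do_softirq(): IRQs are re-enabled while the actions run. */
static void model_do_softirq(struct cpu_model *cpu)
{
    int max_restart = MAX_SOFTIRQ_RESTART;

    cpu->softirq_depth++;               /* like __local_bh_disable() */
    do {
        cpu->softirq_pending = false;
        cpu->irqs_enabled = true;       /* like local_irq_enable() */
        cpu->softirq_runs++;            /* the softirq action runs here */
        if (cpu->ticks_to_inject > 0) { /* a tick arrives mid-softirq */
            cpu->ticks_to_inject--;
            model_irq(cpu);
        }
        cpu->irqs_enabled = false;      /* like local_irq_disable() */
    } while (cpu->softirq_pending && --max_restart);

    if (cpu->softirq_pending)
        cpu->deferred_to_thread = true; /* like wakeup_softirqd() */
    cpu->softirq_depth--;
}

/* Models IRQ entry, a timer handler raising a softirq, and irq_exit(). */
static void model_irq(struct cpu_model *cpu)
{
    bool saved = cpu->irqs_enabled;

    assert(saved);                      /* IRQs must be enabled to take one */
    cpu->irqs_enabled = false;
    cpu->hardirq_depth++;
    cpu->softirq_pending = true;        /* handler raises a softirq */
    cpu->hardirq_depth--;

    /* irq_exit(): only the outermost context processes softirqs. */
    if (cpu->hardirq_depth == 0 && cpu->softirq_depth == 0 &&
        cpu->softirq_pending)
        model_do_softirq(cpu);

    cpu->irqs_enabled = saved;          /* return from exception */
}

/* Take one IRQ, with extra_ticks further IRQs arriving during softirqs. */
static int run_model(int extra_ticks, bool *deferred)
{
    struct cpu_model cpu = { .irqs_enabled = true,
                             .ticks_to_inject = extra_ticks };

    model_irq(&cpu);
    *deferred = cpu.deferred_to_thread;
    return cpu.softirq_runs;
}
```

In this model, a nested timer IRQ only marks more softirq work as pending, and the restart loop is bounded, so re-enabling IRQs inside __do_softirq() does not by itself produce an endless loop. If the real kernel behaves like the model, the lack of progress would have to come from IRQs arriving faster than they can be serviced.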

I have a hunch that it may be simply because the IRQ processing is too slow
compared to the timer ticks (due to virtualization overhead), but I'm
curious why it would only manifest at this point in the guest boot process.
Can other subsystems somehow register for processing during IRQ handling,
which slows it down just enough to hang at this point?
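
The "IRQ processing is too slow compared to the timer ticks" hypothesis can be illustrated with a back-of-the-envelope simulation. This is a toy model, not a measurement: foreground_units() is a hypothetical helper, time is counted in abstract units, and the timer is assumed to behave like a single pending line (ticks that arrive while one is already pending are coalesced).

```c
#include <assert.h>

/*
 * Count how many time units the interrupted code (e.g. an initcall)
 * gets to run in `total` units, when a timer fires every `tick_period`
 * units and handling one tick costs `handler_cost` units.
 */
static long foreground_units(long tick_period, long handler_cost, long total)
{
    long fg = 0;         /* units spent in the interrupted code */
    long remaining = 0;  /* handler work left for the current tick */
    int pending = 0;     /* timer line asserted, not yet serviced */

    assert(tick_period > 0);

    for (long t = 0; t < total; t++) {
        if (t > 0 && t % tick_period == 0)
            pending = 1;                /* a new tick arrives */
        if (remaining == 0 && pending) {
            pending = 0;                /* take the IRQ */
            remaining = handler_cost;
        }
        if (remaining > 0)
            remaining--;                /* CPU is busy in the handler */
        else
            fg++;                       /* interrupted code makes progress */
    }
    return fg;
}
```

With tick_period = 10 and handler_cost = 2, the interrupted code gets 82 of 100 units. Once handler_cost reaches tick_period (for example 10 or 15), it gets only the 10 units before the first tick, no matter how long the simulation runs. That matches a guest that neither crashes nor proceeds. Under this model, anything that raises the per-tick cost past the tick period at a particular point in boot would be enough to stall the guest there.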

If anyone can confirm or refute these ideas, or has other insights into this
issue, it would be most welcome to ease further development.

Thanks!

Stack trace
----------------
initcall neigh_init+0x0/0x8c returned 0 after 0 msecs
calling  genl_init+0x0/0x164
<7>genl_init start
<7>genl_register_family
<7>genl_register_ops
<7>netlink_set_nonroot
<7>netlink_kernel_create
<7>genl_sock == ?
<7>genl_register_mc_group
[<c01f6e90>] (dump_stack+0x0/0x14) from [<c003c388>] (__do_softirq+0x54/0xf0)
[<c003c334>] (__do_softirq+0x0/0xf0) from [<c003c46c>] (irq_exit+0x48/0x50)
 r6:00000006 r5:00000000 r4:c027d838
[<c003c424>] (irq_exit+0x0/0x50) from [<c0021048>] (__exception_text_start+0x48/0x60)
[<c0021000>] (__exception_text_start+0x0/0x60) from [<c0021844>] (__irq_svc+0x24/0xa0)
Exception stack(0xc381de5c to 0xc381dea4)
de40:                                                                c025e265
de60: c381df1c 00000000 c027ba0c c381df1c c0288aec c0288aec 00000004 60000013
de80: 00000000 c025e265 c381df00 c381df04 c381dea4 c01f6fe4 c00385dc 60000013
dea0: ffffffff
 r6:00000001 r5:f1000000 r4:ffffffff
[<c00385b4>] (vprintk+0x0/0x2ac) from [<c01f6fe4>] (printk+0x28/0x34)
[<c01f6fbc>] (printk+0x0/0x34) from [<c019ae48>] (genl_register_mc_group+0x50/0x1c4)
 r3:00000000 r2:00000000 r1:c0288b38 r0:c025e265
[<c019adf8>] (genl_register_mc_group+0x0/0x1c4) from [<c0019658>] (genl_init+0xe0/0x164)
 r8:2ead2478 r7:00000001 r6:00000000 r5:c0304c60 r4:00000000
[<c0019578>] (genl_init+0x0/0x164) from [<c00212a4>] (do_one_initcall+0x54/0x18c)
 r5:c0019578 r4:c001d96c
[<c0021250>] (do_one_initcall+0x0/0x18c) from [<c00085b4>] (kernel_init+0x7c/0xec)
 r8:00000000 r7:00000000 r6:00000000 r5:c001d8c4 r4:c001d96c
[<c0008538>] (kernel_init+0x0/0xec) from [<c003a344>] (do_exit+0x0/0x6f0)
 r5:00000000 r4:00000000