Re: [Qemu-devel] Re: irq problems after live migration with 0.12.4

On 23.05.2010 at 12:38, Michael Tokarev wrote:

> 23.05.2010 13:55, Peter Lieven wrote:
>> Hi,
>> 
>> after live migrating ubuntu 9.10 server (2.6.31-14-server) and suse linux 10.1 (2.6.16.13-4-smp),
>> the guest sometimes runs into irq problems. i mention these 2 guest OSes
>> because that is where i have seen the error. there are likely others around with the same problem.
>> 
>> on the host i run 2.6.33.3 (kernel+mod) and qemu-kvm 0.12.4.
>> 
>> i started a vm with:
>> /usr/bin/qemu-kvm-0.12.4  -net tap,vlan=141,script=no,downscript=no,ifname=tap0 -net nic,vlan=141,model=e1000,macaddr=52:54:00:ff:00:72   -drive file=/dev/sdb,if=ide,boot=on,cache=none,aio=native  -m 1024 -cpu qemu64,model_id='Intel(R) Xeon(R) CPU           E5430  @ 2.66GHz'  -monitor tcp:0:4001,server,nowait -vnc :1 -name 'migration-test-9-10'  -boot order=dc,menu=on  -k de  -incoming tcp:172.21.55.22:5001  -pidfile /var/run/qemu/vm-155.pid  -mem-path /hugepages -mem-prealloc  -rtc base=utc,clock=host -usb -usbdevice tablet
>> 
>> for testing i have a clean ubuntu 9.10 server 64-bit install and created a small script which fetches a dvd iso from a local server and checks its md5sum in an endless loop.
>> 
>> the download performance is approx. 50MB/s on that vm.
>> 
>> to trigger the error i did several migrations of the vm over the last few days. finally i ended up with the following oops in the guest:
>> 
>> [64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)
>> [64442.299175] Pid: 0, comm: swapper Not tainted 2.6.31-14-server #48-Ubuntu
>> [64442.299179] Call Trace:
>> [64442.299185]<IRQ>   [<ffffffff810b4b96>] __report_bad_irq+0x26/0xa0
>> [64442.299227]  [<ffffffff810b4d9c>] note_interrupt+0x18c/0x1d0
>> [64442.299232]  [<ffffffff810b5415>] handle_fasteoi_irq+0xd5/0x100
>> [64442.299244]  [<ffffffff81014bdd>] handle_irq+0x1d/0x30
>> [64442.299246]  [<ffffffff810140b7>] do_IRQ+0x67/0xe0
>> [64442.299249]  [<ffffffff810129d3>] ret_from_intr+0x0/0x11
>> [64442.299266]  [<ffffffff810b3234>] ? handle_IRQ_event+0x24/0x160
>> [64442.299269]  [<ffffffff810b529f>] ? handle_edge_irq+0xcf/0x170
>> [64442.299271]  [<ffffffff81014bdd>] ? handle_irq+0x1d/0x30
>> [64442.299273]  [<ffffffff810140b7>] ? do_IRQ+0x67/0xe0
>> [64442.299275]  [<ffffffff810129d3>] ? ret_from_intr+0x0/0x11
>> [64442.299290]  [<ffffffff81526b14>] ? _spin_unlock_irqrestore+0x14/0x20
>> [64442.299302]  [<ffffffff8133257c>] ? scsi_dispatch_cmd+0x16c/0x2d0
>> [64442.299307]  [<ffffffff8133963a>] ? scsi_request_fn+0x3aa/0x500
>> [64442.299322]  [<ffffffff8125fafc>] ? __blk_run_queue+0x6c/0x150
>> [64442.299324]  [<ffffffff8125fcbb>] ? blk_run_queue+0x2b/0x50
>> [64442.299327]  [<ffffffff8133899f>] ? scsi_run_queue+0xcf/0x2a0
>> [64442.299336]  [<ffffffff81339a0d>] ? scsi_next_command+0x3d/0x60
>> [64442.299338]  [<ffffffff8133a21b>] ? scsi_end_request+0xab/0xb0
>> [64442.299340]  [<ffffffff8133a50e>] ? scsi_io_completion+0x9e/0x4d0
>> [64442.299348]  [<ffffffff81036419>] ? default_spin_lock_flags+0x9/0x10
>> [64442.299351]  [<ffffffff8133224d>] ? scsi_finish_command+0xbd/0x130
>> [64442.299353]  [<ffffffff8133aa95>] ? scsi_softirq_done+0x145/0x170
>> [64442.299356]  [<ffffffff81264e6d>] ? blk_done_softirq+0x7d/0x90
>> [64442.299368]  [<ffffffff810651fd>] ? __do_softirq+0xbd/0x200
>> [64442.299370]  [<ffffffff810131ac>] ? call_softirq+0x1c/0x30
>> [64442.299372]  [<ffffffff81014b85>] ? do_softirq+0x55/0x90
>> [64442.299374]  [<ffffffff81064f65>] ? irq_exit+0x85/0x90
>> [64442.299376]  [<ffffffff810140c0>] ? do_IRQ+0x70/0xe0
>> [64442.299379]  [<ffffffff810129d3>] ? ret_from_intr+0x0/0x11
>> [64442.299380]<EOI>   [<ffffffff810356f6>] ? native_safe_halt+0x6/0x10
>> [64442.299390]  [<ffffffff8101a20c>] ? default_idle+0x4c/0xe0
>> [64442.299395]  [<ffffffff815298f5>] ? atomic_notifier_call_chain+0x15/0x20
>> [64442.299398]  [<ffffffff81010e02>] ? cpu_idle+0xb2/0x100
>> [64442.299406]  [<ffffffff815123c6>] ? rest_init+0x66/0x70
>> [64442.299424]  [<ffffffff81838047>] ? start_kernel+0x352/0x35b
>> [64442.299427]  [<ffffffff8183759a>] ? x86_64_start_reservations+0x125/0x129
>> [64442.299429]  [<ffffffff81837698>] ? x86_64_start_kernel+0xfa/0x109
>> [64442.299433] handlers:
>> [64442.299840] [<ffffffffa0000b80>] (e1000_intr+0x0/0x190 [e1000])
>> [64442.300046] Disabling IRQ #10
> 
> See also LP bug #584131 (https://bugs.launchpad.net/bugs/584131)
> and original Debian bug#580649 (http://bugs.debian.org/580649)
> 
> Not sure if they're related...
> 
> /mjt
> 
> 

hi, thanks for the pointer.

i have seen them. the reporters of these bugs think that
the bug is caused by the virtio subsystem. at least the debian bug reporter
says it does not occur with virtio disabled.
there is no virtio involved here, but of course the cause could be the same.
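
to double-check the "no virtio" part, here is a rough python sketch that scans a qemu command line for option values mentioning virtio. the helper name virtio_devices is made up for illustration, and the substring match is only a heuristic (a file path containing "virtio" would be a false positive). for the command line quoted above it should find nothing, since the disk uses if=ide and the nic uses model=e1000.

```python
import shlex


def virtio_devices(cmdline):
    """Return (option, value) pairs of a qemu command line that mention virtio.

    Hypothetical helper, not part of qemu: it only does a substring match
    on option values, so treat the result as a hint.
    """
    args = shlex.split(cmdline)  # honours quoting, e.g. model_id='...'
    hits = []
    for opt, val in zip(args, args[1:]):
        # only look at "-option value" pairs, skip bare flags such as -usb
        if opt.startswith("-") and not val.startswith("-"):
            if "virtio" in val.lower():
                hits.append((opt, val))
    return hits


if __name__ == "__main__":
    cmd = ("qemu-kvm -net nic,vlan=141,model=e1000 "
           "-drive file=/dev/sdb,if=ide,cache=none -m 1024")
    print(virtio_devices(cmd))
```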

i have a test platform here and i'm willing to make any modifications
to kernel, kvm-kmod, qemu-kvm or guest kernel to debug the problem.
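
to notice the failure during repeated migration runs, something like the following could be run against the guest's kernel log. it is only a sketch: the function name find_irq_problems is hypothetical, and the two patterns are simply taken from the log lines quoted above.

```python
import re

# patterns copied from the guest messages quoted in this thread
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")


def find_irq_problems(log_text):
    """Collect irq numbers the kernel complained about or disabled."""
    reported = set()
    disabled = set()
    for line in log_text.splitlines():
        m = NOBODY_CARED.search(line)
        if m:
            reported.add(int(m.group(1)))
        m = DISABLED.search(line)
        if m:
            disabled.add(int(m.group(1)))
    return {"nobody_cared": sorted(reported), "disabled": sorted(disabled)}


if __name__ == "__main__":
    sample = (
        '[64442.298521] irq 10: nobody cared (try booting with the "irqpoll" option)\n'
        "[64442.300046] Disabling IRQ #10\n"
    )
    print(find_irq_problems(sample))
```

the input would be the guest's dmesg output saved after each migration; an empty result in both lists means neither message showed up.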

peter

