On Fri, Aug 02, 2013 at 08:06:58PM +0200, folkert wrote:
> > A couple of questions:
> > Please post the QEMU command-line from the host (ps aux | grep qemu).
>
> I'll post them all:
> - UMTS-clone: this one works fine since it was created a week ago
> - belle: this one was fine but suddenly also showed the problem
> - mauer: the problem one
>
> 112 4819 1 4 Jul30 ? 03:29:39 /usr/bin/kvm -S -M pc-1.1 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name UMTS-clone -uuid e49502f1-0c74-2a60-99dc-7602da5ee640 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/UMTS-clone.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/VGNEO/LV_V_UMTS-clone,if=none,id=drive-virtio-disk0,format=raw,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/home/folkert/ISOs/wheezy.iso,if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:09:3b:b6,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0,password -vga cirrus -device usb-host,hostbus=6,hostaddr=5,id=hostdev0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
> 112 10065 1 11 Jul30 ? 07:46:16 /usr/bin/kvm -S -M pc-1.1 -enable-kvm -m 8192 -smp 12,sockets=12,cores=1,threads=1 -name belle -uuid 16b704d7-5fbd-d67b-71e6-0d6b43f1bc0a -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/belle.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/VGNEO/LV_V_BELLE,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/VGNEO/LV_V_BELLE_OS,if=none,id=drive-virtio-disk1,format=raw,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk1,id=virtio-disk1 -drive file=/dev/VGJOURNAL/LV_J_BELLE,if=none,id=drive-ide0-0-0,format=raw,cache=writeback -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:75:4a:6f,bus=pci.0,addr=0x3 -netdev tap,fd=28,id=hostnet1,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:0a:6e:de,bus=pci.0,addr=0x7 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:1,password -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
> root 13116 12830 0 19:54 pts/8 00:00:00 grep qemu
> 112 23453 1 57 13:16 ? 03:46:51 /usr/bin/kvm -S -M pc-1.1 -enable-kvm -m 8192 -smp 8,maxcpus=12,sockets=12,cores=1,threads=1 -name mauer -uuid 3a8452e6-81af-b185-63b6-2b32be17ed87 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/mauer.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/VGNEO/LV_V_MAUER,if=none,id=drive-virtio-disk0,format=raw,cache=writeback -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/dev/VGJOURNAL/LV_J_MAUER,if=none,id=drive-virtio-disk1,format=raw,cache=writethrough -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk1,id=virtio-disk1 -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:86:d9:1f,bus=pci.0,addr=0x3 -netdev tap,fd=28,id=hostnet1,vhost=on,vhostfd=29 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:a3:12:8a,bus=pci.0,addr=0x4 -netdev tap,fd=30,id=hostnet2,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:0f:54:c2,bus=pci.0,addr=0x5 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:2,password -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x7 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9
>
> Note that everything is managed via virt-manager.
>
> > Please confirm that vhost_net is being used on the host (lsmod | grep
> > vhost_net).
>
> Yes, loaded and used:
>
> root@neo:/home/folkert# lsmod | grep vhost
> vhost_net 27658 6
> macvtap 17638 1 vhost_net
> tun 22479 13 vhost_net
>
> > Please double-check both guest and host dmesg for any suspicious
> > messages. It could be about networking, out-of-memory, or kernel
> > backtraces.
>
> I have to get back to this: I see messages about topology changes in the host but I forgot to check then if they were there when the problem started.
> I *think* they appeared after I rebooted the guests but I'm not entirely sure. So let's wait on that.
> Those are the only messages apart from devices going into promiscuous mode when I start tcpdump.

Hi Folkert,
If you do find something in dmesg, that could be very helpful.

I'm trying to put together all the data points but a few things are unclear:

Based on this information it seems like a bug in the virtio_net guest driver or vhost_net on the host. However, there is one contradictory piece of evidence: in the original bug report you said "using e1000 instead of virtio: did not help". Can you confirm that e1000 also does not work?

In your original bug report you said "If I then ping any host connected to that interface, no ping comes back: only a message about buffer space not being enough". Can you post the exact error message and say whether it is printed by ping inside the guest, dmesg inside the guest, or dmesg on the host?

There is still the possibility that there is a networking configuration issue or bug inside the guest itself. That would explain why this has happened across different configurations (tap, macvtap, vhost_net, e1000).

Two approaches to get closer to the source of the problem:

1. Try the latest vanilla kernel on the host (Linux 3.10.5). This way you can rule out bugs in vhost_net or tap that have already been fixed.

2. Get the system into the bad state and then dig deeper. Start with an outgoing ping, instrument the guest driver and host vhost_net functions to see what the drivers are doing, inspect the transmit vring, etc.
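
When comparing the guests it can help to confirm which NIC model and backend each one really has, rather than reading the long command lines by eye. A minimal sketch in Python follows. The helper names parse_opts and summarize_nics are hypothetical, and the parser only handles the plain comma-separated key=value form seen in the ps output quoted above, not QEMU's full option syntax (for example escaped commas):

```python
import shlex


def parse_opts(value):
    """Split an option value like 'tap,fd=20,id=hostnet0' into (head, {key: val})."""
    head, *rest = value.split(",")
    opts = {}
    for item in rest:
        # Bare flags such as 'server' end up with an empty string as value.
        key, _, val = item.partition("=")
        opts[key] = val
    return head, opts


def summarize_nics(cmdline):
    """Return one dict per -netdev: backend, vhost flag, and the attached NIC."""
    args = shlex.split(cmdline)
    netdevs, nics = {}, {}
    # Walk (flag, value) pairs; only -netdev and -device are of interest here.
    for flag, value in zip(args, args[1:]):
        if flag == "-netdev":
            backend, opts = parse_opts(value)
            netdevs[opts.get("id")] = {
                "backend": backend,
                "vhost": opts.get("vhost", "off"),
            }
        elif flag == "-device":
            model, opts = parse_opts(value)
            if "netdev" in opts:  # a NIC device refers to its backend by id
                nics[opts["netdev"]] = {"model": model, "mac": opts.get("mac")}
    return [
        {"netdev": nid, **info, **nics.get(nid, {"model": None, "mac": None})}
        for nid, info in netdevs.items()
    ]


if __name__ == "__main__":
    # Excerpt of the UMTS-clone command line quoted above.
    excerpt = (
        "/usr/bin/kvm -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=21 "
        "-device virtio-net-pci,netdev=hostnet0,id=net0,"
        "mac=52:54:00:09:3b:b6,bus=pci.0,addr=0x3"
    )
    for nic in summarize_nics(excerpt):
        print(nic)
```

Running it over each guest's command line should make it obvious if one interface is missing vhost=on or uses a different model than expected.
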

#1 is probably the best next step. If it fails and you still have time to work on a solution, we can start digging deeper with #2.

Stefan