But older distros/kernels work fine?

Can you take a network trace?

About half a year ago there was an issue where recent kernels had added
support to start using new scsi opcodes, but the qemu functions that
determine "which transfer direction is used for this opcode" had not yet
been updated, so the opcode was sent with the wrong transfer direction.

That caused the guest's memory to be overwritten and crash.

I don't have (easy) access to the git tree right now, but it was a patch
for the ATA_PASSTHROUGH command that fixed that.

Could you take a network trace and check whether this is something similar?
I.e. does the guest send an "unusual" scsi opcode just before the crash?

regards
ronnie sahlberg

On Tue, Oct 30, 2012 at 12:37 PM, Peter Lieven <pl@xxxxxxxxx> wrote:
> On 30.10.2012 19:27, Stefan Hajnoczi wrote:
>
>> On Tue, Oct 30, 2012 at 4:56 PM, Peter Lieven <pl@xxxxxxxxx> wrote:
>>>
>>> On 30.10.2012 09:32, Stefan Hajnoczi wrote:
>>>>
>>>> On Mon, Oct 29, 2012 at 03:09:37PM +0100, Peter Lieven wrote:
>>>>>
>>>>> Hi,
>>>>
>>>> Bug subject should be virtio-blk, not virtio-scsi. virtio-scsi is a
>>>> different virtio device type from virtio-blk and is not present in the
>>>> backtrace you posted.
>>>>
>>>> Sounds pedantic but I want to make sure this gets chalked up against the
>>>> right device :).
>>>>
>>>>> If I try to install Ubuntu 12.04 LTS / 12.10 64-bit on a virtio
>>>>> storage backend that supports iSCSI
>>>>> qemu-kvm crashes reliably with the following error:
>>>>
>>>> Are you using vanilla qemu-kvm-1.2.0 or are there patches applied?
>>>>
>>>> Have you tried qemu-kvm.git/master?
>>>>
>>>> Have you tried a local raw disk image to check whether libiscsi is
>>>> involved?
>>>>
>>>>> Bad ram pointer 0x3039303620008000
>>>>>
>>>>> This happens directly after the confirmation of the Timezone before
>>>>> the Disk is partitioned.
>>>>>
>>>>> If I specify -global virtio-blk-pci.scsi=off in the cmdline this
>>>>> does not happen.
>>>>>
>>>>> Here is a stack trace:
>>>>>
>>>>> Thread 1 (Thread 0x7ffff7fee700 (LWP 8226)):
>>>>> #0 0x00007ffff63c0a10 in abort () from /lib/x86_64-linux-gnu/libc.so.6
>>>>> No symbol table info available.
>>>>> #1 0x00005555557b751d in qemu_ram_addr_from_host_nofail (
>>>>> ptr=0x3039303620008000) at /usr/src/qemu-kvm-1.2.0/exec.c:2835
>>>>> ram_addr = 0
>>>>> #2 0x00005555557b9177 in cpu_physical_memory_unmap (
>>>>> buffer=0x3039303620008000, len=4986663671065686081, is_write=1,
>>>>> access_len=1) at /usr/src/qemu-kvm-1.2.0/exec.c:3645
>>>>
>>>> buffer and len are ASCII junk. It appears to be hex digits and it's not
>>>> clear where they come from.
>>>>
>>>> It would be interesting to print *elem one stack frame up in #3
>>>> virtqueue_fill() to show the iovecs and in/out counts.
>>>
>>>
>>> (gdb) print *elem
>>
>> Great, thanks for providing this info:
>>
>>> $6 = {index = 3, out_num = 2, in_num = 4, in_addr = {1914920960,
>>> 1916656688,
>>> 2024130072, 2024130088, 0 <repeats 508 times>, 4129, 93825009696000,
>>> 140737328183160, 0 <repeats 509 times>}, out_addr = {2024130056,
>>> 2038414056, 0, 8256, 4128, 93824999311936, 0, 3, 0 <repeats 512
>>> times>,
>>> 12385, 93825009696000, 140737328183160, 0 <repeats 501 times>},
>>
>> Up to here everything is fine.
>>
>>> in_sg =
>>> {{
>>> iov_base = 0x3039303620008000, iov_len = 4986663671065686081}, {
>>> iov_base = 0x3830384533334635, iov_len = 3544389261899019573}, {
>>
>> The fields are bogus, in_sg has been overwritten with ASCII data.
>> Unfortunately I don't see any hint of where this ASCII data came from
>> yet.
>>
>> The hdr fields you provided in stack frame #6 show that in_sg was
>> overwritten during or after the bdrv_ioctl() call. We pulled valid
>> data out of the vring and mapped buffers correctly.
>> But something is overwriting in_sg and when we complete the request we
>> blow up due to the bogus values.
>
> Ok. One thing I have to mention: I've been testing with qemu-kvm 1.2.0
> and libiscsi for a few weeks now. It's been very stable. The only place
> it blows up is during the debian/ubuntu installer. Ubuntu itself, for
> instance, is running flawlessly. My guess is that the installer is probing
> for something. The installer itself also runs flawlessly when I disable
> scsi passthru with scsi=off.
>
>>
>> Please post your full qemu-kvm command-line.
>
> /usr/bin/qemu-kvm-1.2.0 -net
> tap,vlan=164,script=no,downscript=no,ifname=tap0 -net
> nic,vlan=164,model=e1000,macaddr=52:54:00:ff:01:35 -iscsi
> initiator-name=iqn.2005-03.org.virtual-core:0025b51f001c -drive
> format=iscsi,file=iscsi://172.21.200.56/iqn.2001-05.com.equallogic:0-8a0906-335f4e007-d29001a3355508e8-libiscsi-test-hd0/0,if=virtio,cache=none,aio=native
> -m 2048 -smp 2,sockets=1,cores=2,threads=1 -monitor
> tcp:0:4002,server,nowait -vnc :2 -qmp tcp:0:3002,server,nowait -name
> 'libiscsi-debug' -boot order=dc,menu=off -k de -pidfile
> /var/run/qemu/vm-280.pid -mem-path /hugepages -mem-prealloc -cpu
> host,+x2apic,model_id='Intel(R) Xeon(R) CPU L5640 @ 2.27GHz',-tsc
> -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus
>
>>
>> Please also post the exact qemu-kvm version you are using. I can see
>> it's based on qemu-kvm-1.2.0 but are there any patches applied (e.g.
>> distro packages may carry patches so the full package version
>> information would be useful)?
>
> I use vanilla qemu-kvm 1.2.0 with some cherry-picked patches. I will
> retry with untouched qemu-kvm 1.2.0 and latest git tomorrow at the latest.
>>
>>
>> Thanks,
>> Stefan
>
> Thank you, too
> Peter
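
A note on the "ASCII junk" observation in the quoted gdb output: the
following is a minimal Python sketch, not anything taken from qemu, that
reinterprets the bogus 64-bit values as raw bytes. It assumes a
little-endian host (the /lib/x86_64-linux-gnu path in the backtrace
suggests x86_64). The helper name as_ascii is made up for illustration.

```python
import struct


def as_ascii(value: int) -> str:
    """Reinterpret a 64-bit integer as 8 little-endian bytes.

    Printable ASCII bytes are shown as characters, everything else as '.'.
    """
    raw = struct.pack("<Q", value)
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# Values copied from the gdb output quoted in this thread.
suspects = {
    "in_sg[0].iov_base": 0x3039303620008000,
    "in_sg[0].iov_len": 4986663671065686081,
    "in_sg[1].iov_base": 0x3830384533334635,
    "in_sg[1].iov_len": 3544389261899019573,
}

for name, value in suspects.items():
    print(f"{name:18} {value:#018x}  {as_ascii(value)!r}")
```

For instance, as_ascii(0x3830384533334635) gives '5F33E808', i.e. eight
ASCII hex digits, which is consistent with the remark that the overwritten
fields look like hex digits rather than pointers or lengths.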