On Mon, 2010-08-02 at 12:42 +0300, Avi Kivity wrote:
> On 08/02/2010 12:06 PM, Avi Kivity wrote:
> > I'm hitting some migration issues merging qemu.git into qemu-kvm.git:
> >
> > 1. Crash in mig_cancel test:
> >
> > (gdb) bt
> > #0 0x0000003a91c83dbb in memcpy () from /lib64/libc.so.6
> > #1 0x000000000049c2ff in qemu_get_buffer (f=0x302d870, buf=<value
> > optimized out>, size1=4096) at /usr/include/bits/string3.h:52
> > #2 0x0000000000409464 in ram_load (f=0x302d870, opaque=<value
> > optimized out>, version_id=4) at
> > /build/home/tlv/akivity/qemu-kvm/arch_init.c:407
> > #3 0x000000000049cb4c in qemu_loadvm_state (f=0x302d870) at
> > savevm.c:1708
> > #4 0x0000000000494169 in process_incoming_migration (f=<value
> > optimized out>) at migration.c:63
> > #5 0x0000000000494517 in tcp_accept_incoming_migration (opaque=<value
> > optimized out>) at migration-tcp.c:163
> > #6 0x000000000041b67e in main_loop_wait (nonblocking=<value optimized
> > out>) at /build/home/tlv/akivity/qemu-kvm/vl.c:1300
> > #7 0x00000000004314e7 in kvm_main_loop () at
> > /build/home/tlv/akivity/qemu-kvm/qemu-kvm.c:1710
> > #8 0x000000000041c67f in main_loop (argc=<value optimized out>,
> > argv=<value optimized out>, envp=<value optimized out>)
> > at /build/home/tlv/akivity/qemu-kvm/vl.c:1340
> > #9 main (argc=<value optimized out>, argv=<value optimized out>,
> > envp=<value optimized out>) at /build/home/tlv/akivity/qemu-kvm/vl.c:3069
> >
> > This is on the incoming side so the test completes successfully, only
> > leaving a core dump to fill my disks.
> >
>
> This appears to be
>
> static inline void *host_from_stream_offset(QEMUFile *f,
>                                             ram_addr_t offset,
>                                             int flags)
> {
>     static RAMBlock *block = NULL;
>     char id[256];
>     uint8_t len;
>
>     if (flags & RAM_SAVE_FLAG_CONTINUE) {
>         if (!block) {
>             fprintf(stderr, "Ack, bad migration stream!\n");
>             return NULL;
>         }
>
>         return block->host + offset;
>     }
>
> with block == NULL, if my gdb-fu got a static variable in an inlined
> function examined correctly.

If block == NULL, are you getting the fprintf?

> I don't see any special reason for block to be NULL on a cancelled
> migration. Though perhaps the incoming stream was terminated without us
> noticing, and we're migrating from some random buffer and confusing the
> code?

Yeah, I don't understand that either, block == NULL should only be an
initial state, once we've seen a block it shouldn't happen. Does this
patch solve anything:

http://lists.nongnu.org/archive/html/qemu-devel/2010-07/msg01114.html

I could see this fixing it if the migration was re-attempted after the
cancel.

Alex
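
For anyone following along, here is a minimal standalone sketch of the
pattern under discussion. It is not the QEMU code: SketchBlock,
SKETCH_FLAG_CONTINUE, sketch_host_from_offset() and sketch_load_page() are
hypothetical names made up for illustration. It only models the idea that a
CONTINUE record depends on a block cached in a static, and that a NULL
result handed to a copy would fault much like the memcpy in frame #0. The
NULL check in the caller is an assumption about one way to avoid that, not
a description of any existing fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
#define SKETCH_FLAG_CONTINUE 0x20
#define SKETCH_BLOCK_SIZE    64

typedef struct SketchBlock {
    uint8_t host[SKETCH_BLOCK_SIZE];
} SketchBlock;

/* As in the quoted function, the last block seen lives in a static, so it
 * keeps its value across calls and starts out NULL. */
static SketchBlock *last_block = NULL;

/* Resolve the destination for a record. A CONTINUE record reuses the
 * cached block; any other record names its block and updates the cache. */
static uint8_t *sketch_host_from_offset(SketchBlock *named, size_t offset,
                                        int flags)
{
    if (flags & SKETCH_FLAG_CONTINUE) {
        if (!last_block) {
            fprintf(stderr, "Ack, bad migration stream!\n");
            return NULL;
        }
        return last_block->host + offset;
    }
    if (!named) {
        return NULL;
    }
    last_block = named;
    return named->host + offset;
}

/* Load one chunk. Returns 0 on success, -1 if the lookup failed or the
 * chunk would not fit. The NULL check is the point of the sketch: without
 * it, memcpy() would be called with a NULL destination. */
static int sketch_load_page(SketchBlock *named, size_t offset, int flags,
                            const uint8_t *src, size_t len)
{
    uint8_t *dst;

    assert(src != NULL);
    if (offset > SKETCH_BLOCK_SIZE || len > SKETCH_BLOCK_SIZE - offset) {
        return -1;
    }
    dst = sketch_host_from_offset(named, offset, flags);
    if (!dst) {
        return -1;
    }
    memcpy(dst, src, len);
    return 0;
}
```

In this model, a CONTINUE record arriving before any block has been named
fails cleanly with -1, while the same record after a named block succeeds
and writes into the cached block.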