On Thu, 2012-11-15 at 21:26 +0000, Prantis, Kelsey wrote: > On 11/13/12 2:22 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote: > <SNIP> > Hi Nicholas, > > Sorry for the delay. The new debug output with your latest patch (and typo > adjustment) is up at ftp://ftp.whamcloud.com/uploads/lio-debug-4.txt.bz2 > Hi Kelsey, Thanks alot for this new dump. It appears you where able to trigger one of the extra BUG_ON(!cmd) checks added in iblock_bio_destructor(): <SNIP> [ 931.601870] IBLOCK: bio_destructor: bio: ffff88007aae9840, bi_sector: 1056888, bi_size: 0 [ 931.602809] ------------[ cut here ]------------ [ 931.603213] kernel BUG at drivers/target/target_core_iblock.c:567! [ 931.603770] invalid opcode: 0000 [#1] SMP [ 931.604144] Modules linked in: target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables i2c_piix4 i2c_core joydev microcode virtio_net virtio_balloon virtio_blk [ 931.606450] CPU 0 [ 931.606745] Pid: 1066, comm: iscsi_trx Not tainted 3.6.3-1.chromatestlab5.fc17.x86_64 #1 Red Hat KVM [ 931.607626] RIP: 0010:[<ffffffffa00c3eaa>] [<ffffffffa00c3eaa>] iblock_bio_destructor+0x5a/0x60 [target_core_iblock] [ 931.608693] RSP: 0018:ffff88007fc03cb8 EFLAGS: 00010046 [ 931.609135] RAX: 0000000000000000 RBX: ffff88007aae9840 RCX: 0000000000000034 [ 931.609677] RDX: 00000000000000db RSI: 0000000000000046 RDI: 0000000000000046 [ 931.610258] RBP: ffff88007fc03cc8 R08: ffffffff81cdd100 R09: 000000000001a149 [ 931.610799] R10: 203a6f6962203a72 R11: 0000000000000000 R12: ffff88003691d520 [ 931.611406] R13: 0000000000000000 R14: 0000000000002000 R15: 0000000000002000 [ 931.611947] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 [ 931.612671] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 931.613140] CR2: 00007f4ee0193198 CR3: 000000007b08d000 CR4: 00000000000006f0 [ 931.613685] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 931.614271] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 931.614888] Process iscsi_trx (pid: 1066, threadinfo ffff8800796b2000, task ffff8800796aae20) [ 931.615640] Stack: [ 931.615803] ffff88007aae9840 ffff88003691d520 ffff88007fc03ce8 ffffffff811c39a6 [ 931.616445] ffffffff815439d0 ffff88007aae9840 ffff88007fc03d38 ffffffffa00c36f8 [ 931.617059] 0000000000208040 ffff88007fc03d40 ffffffff815442f4 0000000000000000 [ 931.617747] Call Trace: [ 931.617949] <IRQ> [ 931.618148] [<ffffffff811c39a6>] bio_put+0x36/0x40 [ 931.618667] [<ffffffff815439d0>] ? ip_rcv_finish+0x80/0x360 [ 931.619138] [<ffffffffa00c36f8>] iblock_bio_done+0xb8/0x110 [target_core_iblock] [ 931.619709] [<ffffffff815442f4>] ? ip_rcv+0x214/0x320 [ 931.620134] [<ffffffff811c24cd>] bio_endio+0x1d/0x40 [ 931.620612] [<ffffffff812b7513>] req_bio_endio.isra.53+0xa3/0xe0 [ 931.621116] [<ffffffff812b7648>] blk_update_request+0xf8/0x480 [ 931.621604] [<ffffffff81371375>] ? detach_buf+0x95/0xb0 [ 931.622038] [<ffffffff812b79f7>] blk_update_bidi_request+0x27/0xa0 [ 931.622593] [<ffffffff812bad80>] __blk_end_bidi_request+0x20/0x50 [ 931.623100] [<ffffffff812badcf>] __blk_end_request_all+0x1f/0x30 [ 931.623570] [<ffffffffa00000ba>] blk_done+0x5a/0x100 [virtio_blk] [ 931.624092] [<ffffffff81041d49>] ? kvm_clock_read+0x19/0x20 [ 931.624594] [<ffffffff81371aac>] vring_interrupt+0x3c/0xa0 [ 931.625104] [<ffffffff810e7774>] handle_irq_event_percpu+0x54/0x1f0 [ 931.625589] [<ffffffff810e7951>] handle_irq_event+0x41/0x70 [ 931.626054] [<ffffffff810ea16f>] handle_edge_irq+0x6f/0x110 [ 931.626570] [<ffffffff8101510f>] handle_irq+0xbf/0x150 [ 931.627051] [<ffffffff810646f1>] ? irq_enter+0x51/0x90 [ 931.627549] [<ffffffff81626a8a>] do_IRQ+0x5a/0xe0 [ 931.627926] [<ffffffff8161d36a>] common_interrupt+0x6a/0x6a [ 931.628432] <EOI> [ 931.628589] [<ffffffff8105d848>] ? vprintk_emit+0x188/0x4e0 [ 931.629153] [<ffffffff81612f64>] printk+0x61/0x63 [ 931.629523] [<ffffffffa00c41c8>] iblock_execute_rw+0x288/0x330 [target_core_iblock] [ 931.630150] [<ffffffffa0092730>] __target_execute_cmd+0x50/0x90 [target_core_mod] [ 931.630727] [<ffffffffa009307d>] target_execute_cmd+0xed/0x290 [target_core_mod] [ 931.631373] [<ffffffffa00ef450>] iscsi_target_rx_thread+0xdb0/0x1910 [iscsi_target_mod] [ 931.631988] [<ffffffff8101257b>] ? __switch_to+0x12b/0x420 [ 931.632506] [<ffffffffa00ee6a0>] ? iscsit_thread_get_cpumask+0xb0/0xb0 [iscsi_target_mod] [ 931.633183] [<ffffffff8107ed13>] kthread+0x93/0xa0 [ 931.633560] [<ffffffff81626144>] kernel_thread_helper+0x4/0x10 starting from the bio (ffff88007aae9840), and created by a (seemingly) normal 8K I/O with 2 SGLs: [ 931.436476] IBLOCK: sg[0]: ffff88007a94ecc0 addr: ffffea0001e7cac0 length: 4096 offset: 0 [ 931.437166] IBLOCK: bio: ffff88007aae9840 bi_sector: 1056872 bi_size: 4096 [ 931.437730] IBLOCK: sg[1]: ffff88007a94ece0 addr: ffffea0001e4be00 length: 4096 offset: 0 [ 931.438431] IBLOCK: bio: ffff88007aae9840 bi_sector: 1056872 bi_size: 8192 [ 931.472049] IBLOCK: block_lba: 262144 sgl_nents: 2 The final printk of bi_sector (1056888) in iblock_bio_destructor() looks correct from a starting bi_sector(1056872) w/ bi_size=8192, and AFAICT no bios from the log where completed with any type of exception status. So it appears that iblock_bio_destructor()'s bio->bi_private assignment for the saved se_cmd pointer, and likely the leading se_cmd reference from the same ->bi_private in iblock_bio_done() is being cleared and/or stomped upon by something ahead of the callback into IBLOCK code.. Still not exactly sure what's going on here, but I'm starting to lean toward some type of virtio-blk w/ IBLOCK specific bug that overflow's something in virtio with large requests. In any event, I'll spend some time with with virtio-blk + IBLOCK exports next, and try to reproduce on v3.7-rc code. For reference can you share what type of underlying device+LLD that your virtio-blk into KVM is using here..? Also, the output from 'head /sys/block/$DEV/queue/*' from KVM host for the backend device in question would also be helpful to compare against what virtio-blk for it's own limits. Thanks again Kelsey! --nab -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html