Re: Too big sectors - exceeding fabric_max_sectors

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2012-11-15 at 15:50 -0800, Nicholas A. Bellinger wrote:
> On Thu, 2012-11-15 at 21:26 +0000, Prantis, Kelsey wrote:
> > On 11/13/12 2:22 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote:
> > 
> 
> <SNIP>
> 
> > Hi Nicholas,
> > 
> > Sorry for the delay. The new debug output with your latest patch (and typo
> > adjustment) is up at ftp://ftp.whamcloud.com/uploads/lio-debug-4.txt.bz2
> > 
> 
> Hi Kelsey,
> 
> Thanks alot for this new dump.  It appears you where able to trigger one
> of the extra BUG_ON(!cmd) checks added in iblock_bio_destructor():
> 

<SNIP>

> So it appears that iblock_bio_destructor()'s bio->bi_private assignment
> for the saved se_cmd pointer, and likely the leading se_cmd reference
> from the same ->bi_private in iblock_bio_done() is being cleared and/or
> stomped upon by something ahead of the callback into IBLOCK code..
> 
> Still not exactly sure what's going on here, but I'm starting to lean
> toward some type of virtio-blk w/ IBLOCK specific bug that overflow's
> something in virtio with large requests.  In any event, I'll spend some
> time with with virtio-blk + IBLOCK exports next, and try to reproduce on
> v3.7-rc code.
> 

So after spending some time with virtio-blk <-> IBLOCK export in KVM
guest this evening, I'm now able to reproduce the same OOPs you've been
observing with iscsi-target LUNs, using tcm_loop LUNs w/ explicit
max_sectors_kb=8192 settings in order to generate the largish (16384
sector * 512 byte block = 8M) requests with virtio-blk backend export.

Thus far the bug is occurring during fio write-verify tests w/
blocksize=8M with the tcm_loop SCSI LUN (/dev/sdX).  For reference,
below is the kernel log.

So I'll keep poking at this tomorrow, and will (hopefully) get this bug
identified ASAP.

Thank again Kelsey!

--nab

[ 2318.466923] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 2318.467701] IP: [<ffffffffa04dc482>] iblock_complete_cmd+0x8/0x33 [target_core_iblock]
[ 2318.467701] PGD 3792b067 PUD 7ba7b067 PMD 0 
[ 2318.467701] Oops: 0002 [#1] SMP 
[ 2318.467701] Modules linked in: ib_srpt tcm_qla2xxx tcm_loop tcm_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ib_cm ib_sa ib_mad ib_core qla2xxx libfc scsi_transport_fc configfs ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi e1000 virtio_blk sr_mod cdrom sd_mod virtio_pci virtio_ring virtio ata_piix libata [last unloaded: target_core_mod]
[ 2318.467701] CPU 3 
[ 2318.467701] Pid: 8512, comm: kworker/3:33 Not tainted 3.6.0-rc6+ #4 Bochs Bochs
[ 2318.467701] RIP: 0010:[<ffffffffa04dc482>]  [<ffffffffa04dc482>] iblock_complete_cmd+0x8/0x33 [target_core_iblock]
[ 2318.467701] RSP: 0018:ffff88007aff3d10  EFLAGS: 00010282
[ 2318.467701] RAX: ffff88004a656d70 RBX: 0000000000000000 RCX: ffff88007b6035f0
[ 2318.467701] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffff88007b04ca90
[ 2318.467701] RBP: ffff88003ddba000 R08: 0000000000000000 R09: 8000000000000000
[ 2318.467701] R10: 0000000000000001 R11: 0000000000046000 R12: 0000000002698000
[ 2318.467701] R13: ffff8800379d9dc0 R14: 0000000000000000 R15: 0000000000000001
[ 2318.467701] FS:  0000000000000000(0000) GS:ffff88007fd80000(0000) knlGS:0000000000000000
[ 2318.467701] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2318.467701] CR2: 0000000000000000 CR3: 0000000037b35000 CR4: 00000000000006e0
[ 2318.467701] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2318.467701] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2318.467701] Process kworker/3:33 (pid: 8512, threadinfo ffff88007aff2000, task ffff88004a656d70)
[ 2318.467701] Stack:
[ 2318.467701]  ffff88007b04ca90 ffffffffa04dcd8d 0000001000000800 0000000000000800
[ 2318.467701]  0000000000000000 0000000000000000 0000000000000246 ffff88007b04ca90
[ 2318.467701]  ffff88007b04cbb0 ffff88007ad0f300 ffffffffa053a37a 0000000000000000
[ 2318.467701] Call Trace:
[ 2318.467701]  [<ffffffffa04dcd8d>] ? iblock_execute_rw+0x21f/0x278 [target_core_iblock]
[ 2318.467701]  [<ffffffffa053a37a>] ? tcm_loop_queue_status+0x96/0x96 [tcm_loop]
[ 2318.467701]  [<ffffffffa04a7103>] ? __target_execute_cmd+0x41/0x7a [target_core_mod]
[ 2318.467701]  [<ffffffffa053a8f3>] ? tcm_loop_write_pending+0x9/0x10 [tcm_loop]
[ 2318.467701]  [<ffffffffa053a37a>] ? tcm_loop_queue_status+0x96/0x96 [tcm_loop]
[ 2318.467701]  [<ffffffffa04a9181>] ? transport_generic_new_cmd+0x24f/0x2b1 [target_core_mod]
[ 2318.467701]  [<ffffffffa053a37a>] ? tcm_loop_queue_status+0x96/0x96 [tcm_loop]
[ 2318.467701]  [<ffffffffa04a9243>] ? transport_handle_cdb_direct+0x60/0x74 [target_core_mod]
[ 2318.467701]  [<ffffffff81041621>] ? process_one_work+0x1cd/0x2ae
[ 2318.467701]  [<ffffffff810436d2>] ? worker_thread+0x22d/0x32c
[ 2318.467701]  [<ffffffff810434a5>] ? manage_workers+0x254/0x254
[ 2318.467701]  [<ffffffff810434a5>] ? manage_workers+0x254/0x254
[ 2318.467701]  [<ffffffff81046c0f>] ? kthread+0x81/0x89
[ 2318.467701]  [<ffffffff81385ac4>] ? kernel_thread_helper+0x4/0x10
[ 2318.467701]  [<ffffffff81046b8e>] ? kthread_freezable_should_stop+0x53/0x53
[ 2318.467701]  [<ffffffff81385ac0>] ? gs_change+0x13/0x13
[ 2318.467701] Code: 48 4c 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 83 ec 08 48 c7 c6 20 d5 4d a0 48 83 c4 08 e9 78 d0 fc ff 53 48 8b 9f 08 02 00 00 <f0> ff 0b 0f 94 c0 84 c0 74 1f 8b 43 04 83 f8 01 19 f6 f7 d6 83 
[ 2318.467701] RIP  [<ffffffffa04dc482>] iblock_complete_cmd+0x8/0x33 [target_core_iblock]
[ 2318.467701]  RSP <ffff88007aff3d10>
[ 2318.467701] CR2: 0000000000000000
[ 2318.531925] ---[ end trace ea423a9e4f3b8654 ]---
[ 2318.532803] BUG: unable to handle kernel paging request at fffffffffffffff8
[ 2318.533757] IP: [<ffffffff810467cb>] kthread_data+0x7/0xc
[ 2318.533757] PGD 14e9067 PUD 14ea067 PMD 0 
[ 2318.533757] Oops: 0000 [#2] SMP 
[ 2318.533757] Modules linked in: ib_srpt tcm_qla2xxx tcm_loop tcm_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ib_cm ib_sa ib_mad ib_core qla2xxx libfc scsi_transport_fc configfs ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi e1000 virtio_blk sr_mod cdrom sd_mod virtio_pci virtio_ring virtio ata_piix libata [last unloaded: target_core_mod]
[ 2318.533757] CPU 3 
[ 2318.533757] Pid: 8512, comm: kworker/3:33 Tainted: G      D      3.6.0-rc6+ #4 Bochs Bochs
[ 2318.533757] RIP: 0010:[<ffffffff810467cb>]  [<ffffffff810467cb>] kthread_data+0x7/0xc
[ 2318.533757] RSP: 0018:ffff88007aff39b0  EFLAGS: 00010002
[ 2318.533757] RAX: 0000000000000000 RBX: ffff88007fd91080 RCX: 0000000000000003
[ 2318.533757] RDX: ffffffff81659220 RSI: 0000000000000003 RDI: ffff88004a656d70
[ 2318.533757] RBP: ffff88004a656d70 R08: 0000000000000400 R09: ffffffff810340bf
[ 2318.533757] R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000003
[ 2318.533757] R13: ffff88007aff3a88 R14: 0000000000000003 R15: 0000000000000001
[ 2318.533757] FS:  0000000000000000(0000) GS:ffff88007fd80000(0000) knlGS:0000000000000000
[ 2318.533757] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2318.533757] CR2: fffffffffffffff8 CR3: 0000000037b35000 CR4: 00000000000006e0
[ 2318.533757] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2318.533757] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2318.533757] Process kworker/3:33 (pid: 8512, threadinfo ffff88007aff2000, task ffff88004a656d70)
[ 2318.533757] Stack:
[ 2318.533757]  ffffffff81042fb7 ffff88007fd91080 ffff88004a656d70 ffff88004a657028
[ 2318.533757]  ffffffff8137e5fb ffff88004a656d70 ffff88004a656d70 0000000000011080
[ 2318.566589]  ffff88007aff3fd8 0000000000011080 ffff88007aff3fd8 0000000000011080
[ 2318.566589] Call Trace:
[ 2318.566589]  [<ffffffff81042fb7>] ? wq_worker_sleeping+0xc/0x65
[ 2318.570534]  [<ffffffff8137e5fb>] ? __schedule+0x147/0x470
[ 2318.570534]  [<ffffffff8103410b>] ? do_exit+0x7fd/0x801
[ 2318.570534]  [<ffffffff8137d7a0>] ? printk+0x4c/0x51
[ 2318.570534]  [<ffffffff813803ef>] ? oops_end+0xb1/0xb6
[ 2318.570534]  [<ffffffff81025fb3>] ? no_context+0x254/0x263
[ 2318.570534]  [<ffffffff81382545>] ? do_page_fault+0x1ad/0x34c
[ 2318.570534]  [<ffffffff811a5d4f>] ? blk_recount_segments+0x16/0x25
[ 2318.570534]  [<ffffffff811a1006>] ? drive_stat_acct+0xfb/0x137
[ 2318.578642]  [<ffffffff811a3075>] ? blk_queue_bio+0x269/0x2bf
[ 2318.578642]  [<ffffffff8137f9a5>] ? async_page_fault+0x25/0x30
[ 2318.578642]  [<ffffffffa04dc482>] ? iblock_complete_cmd+0x8/0x33 [target_core_iblock]
[ 2318.578642]  [<ffffffffa04dcd8d>] ? iblock_execute_rw+0x21f/0x278 [target_core_iblock]
[ 2318.578642]  [<ffffffffa053a37a>] ? tcm_loop_queue_status+0x96/0x96 [tcm_loop]
[ 2318.578642]  [<ffffffffa04a7103>] ? __target_execute_cmd+0x41/0x7a [target_core_mod]
[ 2318.578642]  [<ffffffffa053a8f3>] ? tcm_loop_write_pending+0x9/0x10 [tcm_loop]
[ 2318.586625]  [<ffffffffa053a37a>] ? tcm_loop_queue_status+0x96/0x96 [tcm_loop]
[ 2318.586625]  [<ffffffffa04a9181>] ? transport_generic_new_cmd+0x24f/0x2b1 [target_core_mod]
[ 2318.586625]  [<ffffffffa053a37a>] ? tcm_loop_queue_status+0x96/0x96 [tcm_loop]
[ 2318.590543]  [<ffffffffa04a9243>] ? transport_handle_cdb_direct+0x60/0x74 [target_core_mod]
[ 2318.590543]  [<ffffffff81041621>] ? process_one_work+0x1cd/0x2ae
[ 2318.590543]  [<ffffffff810436d2>] ? worker_thread+0x22d/0x32c
[ 2318.594606]  [<ffffffff810434a5>] ? manage_workers+0x254/0x254
[ 2318.594606]  [<ffffffff810434a5>] ? manage_workers+0x254/0x254
[ 2318.594606]  [<ffffffff81046c0f>] ? kthread+0x81/0x89
[ 2318.594606]  [<ffffffff81385ac4>] ? kernel_thread_helper+0x4/0x10
[ 2318.594606]  [<ffffffff81046b8e>] ? kthread_freezable_should_stop+0x53/0x53
[ 2318.594606]  [<ffffffff81385ac0>] ? gs_change+0x13/0x13
[ 2318.594606] Code: 89 e2 fe ff ff eb dc 41 bf ea ff ff ff eb 89 90 65 48 8b 04 25 40 b7 00 00 48 8b 80 60 02 00 00 8b 40 f0 c3 48 8b 87 60 02 00 00 <48> 8b 40 f8 c3 48 3b 3d b9 29 61 00 75 08 0f bf 87 22 06 00 00 
[ 2318.594606] RIP  [<ffffffff810467cb>] kthread_data+0x7/0xc
[ 2318.594606]  RSP <ffff88007aff39b0>
[ 2318.594606] CR2: fffffffffffffff8
[ 2318.594606] ---[ end trace ea423a9e4f3b8655 ]---
[ 2318.594606] Fixing recursive fault but reboot is needed!
[ 2319.498258] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 2319.499233] IP: [<ffffffffa04dc87f>] iblock_bio_destructor+0xc/0x20 [target_core_iblock]
[ 2319.499233] PGD 7bb7f067 PUD 7b066067 PMD 0 
[ 2319.499233] Oops: 0000 [#3] SMP 
[ 2319.499233] Modules linked in: ib_srpt tcm_qla2xxx tcm_loop tcm_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod ib_cm ib_sa ib_mad ib_core qla2xxx libfc scsi_transport_fc configfs ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi e1000 virtio_blk sr_mod cdrom sd_mod virtio_pci virtio_ring virtio ata_piix libata [last unloaded: target_core_mod]
[ 2319.499233] CPU 0 
[ 2319.499233] Pid: 0, comm: swapper/0 Tainted: G      D      3.6.0-rc6+ #4 Bochs Bochs
[ 2319.499233] RIP: 0010:[<ffffffffa04dc87f>]  [<ffffffffa04dc87f>] iblock_bio_destructor+0xc/0x20 [target_core_iblock]
[ 2319.499233] RSP: 0018:ffff88007fc03dc8  EFLAGS: 00010092
[ 2319.499233] RAX: 0000000000000000 RBX: ffff8800378f43c0 RCX: 0000000000080000
[ 2319.499233] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800378f43c0
[ 2319.499233] RBP: ffff88007b04ca90 R08: 0000000000013d20 R09: ffff88007d002700
[ 2319.499233] R10: ffffffffa00410f9 R11: 0000000000000293 R12: 0000000000000000
[ 2319.499233] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 2319.499233] FS:  0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[ 2319.499233] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2319.499233] CR2: 0000000000000028 CR3: 000000007b70a000 CR4: 00000000000006f0
[ 2319.499233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2319.499233] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2319.499233] Process swapper/0 (pid: 0, threadinfo ffffffff814dc000, task ffffffff814ef410)
[ 2319.499233] Stack:
[ 2319.499233]  ffff88000000003b ffffffffa04dc7e4 ffff88007b7e5c38 ffff8800378f43c0
[ 2319.499233]  0000000000080000 ffffffff811a0cc6 ffff880000080000 000000007af106f8
[ 2319.499233]  ffff880037894000 ffff88007b7e5c38 0000000000000000 0000000000000000
[ 2319.499233] Call Trace:
[ 2319.499233]  <IRQ> 
[ 2319.499233]  [<ffffffffa04dc7e4>] ? iblock_bio_done+0x4b/0x57 [target_core_iblock]
[ 2319.499233]  [<ffffffff811a0cc6>] ? blk_update_request+0x171/0x32b
[ 2319.499233]  [<ffffffff811a0e91>] ? blk_update_bidi_request+0x11/0x5b
[ 2319.499233]  [<ffffffff811a1b25>] ? __blk_end_bidi_request+0x10/0x2b
[ 2319.499233]  [<ffffffff811a1b5d>] ? __blk_end_request_all+0x1d/0x2a
[ 2319.499233]  [<ffffffffa00671c2>] ? blk_done+0x98/0xf1 [virtio_blk]
[ 2319.499233]  [<ffffffffa0041286>] ? vring_interrupt+0x6f/0x76 [virtio_ring]
[ 2319.499233]  [<ffffffff8108448c>] ? handle_irq_event_percpu+0x2d/0x130
[ 2319.499233]  [<ffffffff810845bd>] ? handle_irq_event+0x2e/0x4c
[ 2319.499233]  [<ffffffff8108694f>] ? handle_edge_irq+0x98/0xb9
[ 2319.499233]  [<ffffffff81003aa7>] ? handle_irq+0x17/0x20
[ 2319.499233]  [<ffffffff810032da>] ? do_IRQ+0x45/0xad
[ 2319.499233]  [<ffffffff8137f72a>] ? common_interrupt+0x6a/0x6a
[ 2319.499233]  <EOI> 
[ 2319.499233]  [<ffffffff810229fc>] ? native_safe_halt+0x2/0x3
[ 2319.499233]  [<ffffffff8100887e>] ? default_idle+0x23/0x3f
[ 2319.499233]  [<ffffffff81008b02>] ? cpu_idle+0x6b/0xaa
[ 2319.499233]  [<ffffffff81563c43>] ? start_kernel+0x347/0x352
[ 2319.499233]  [<ffffffff81563731>] ? kernel_init+0x1c3/0x1c3
[ 2319.499233]  [<ffffffff815633c3>] ? x86_64_start_kernel+0x102/0x10f
[ 2319.499233] Code: 60 73 c8 4d a0 49 c7 45 50 99 c7 4d a0 4d 89 65 00 48 83 c4 08 4c 89 e8 5b 5d 41 5c 41 5d c3 48 83 ec 08 48 8b 47 58 48 8b 40 70 <48> 8b 40 28 48 8b b0 08 02 00 00 48 83 c4 08 e9 29 59 c1 e0 55 
[ 2319.499233] RIP  [<ffffffffa04dc87f>] iblock_bio_destructor+0xc/0x20 [target_core_iblock]
[ 2319.499233]  RSP <ffff88007fc03dc8>
[ 2319.499233] CR2: 0000000000000028
[ 2319.499233] ---[ end trace ea423a9e4f3b8656 ]---
[ 2319.499233] Kernel panic - not syncing: Fatal exception in interrupt
[ 2319.499233] Shutting down cpus with NMI



--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux