On Fri, 2012-11-16 at 01:55 -0800, Nicholas A. Bellinger wrote:
> On Thu, 2012-11-15 at 23:59 -0800, Nicholas A. Bellinger wrote:
> > On Thu, 2012-11-15 at 15:50 -0800, Nicholas A. Bellinger wrote:
> > > On Thu, 2012-11-15 at 21:26 +0000, Prantis, Kelsey wrote:
> > > > On 11/13/12 2:22 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > <SNIP>
> > > > >
> > > > Hi Nicholas,
> > > >
> > > > Sorry for the delay. The new debug output with your latest patch (and typo
> > > > adjustment) is up at ftp://ftp.whamcloud.com/uploads/lio-debug-4.txt.bz2
> > > >
> > > Hi Kelsey,

<SNIP>

> One more update on this bug..
>
> So after increasing max_sectors_kb for virtio-blk w/ IBLOCK in the KVM
> guest from the default 512 to 2048 with:
>
>    echo 2048 > /sys/block/vda/queue/max_sectors_kb
>
> as well as bumping the hw/virtio-blk.c seg_max default in the qemu code
> from 126 to 256, so that the guest's vdX struct block_device
> automatically registers with max_segments=256 by default:
>
> diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c
> index 6f6d172..c929b6b 100644
> --- a/hw/virtio-blk.c
> +++ b/hw/virtio-blk.c
> @@ -485,7 +485,7 @@ static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
>      bdrv_get_geometry(s->bs, &capacity);
>      memset(&blkcfg, 0, sizeof(blkcfg));
>      stq_raw(&blkcfg.capacity, capacity);
> -    stl_raw(&blkcfg.seg_max, 128 - 2);
> +    stl_raw(&blkcfg.seg_max, 258 - 2);
>      stw_raw(&blkcfg.cylinders, s->conf->cyls);
>      stl_raw(&blkcfg.blk_size, blk_size);
>      stw_raw(&blkcfg.min_io_size, s->conf->min_io_size / blk_size);
>
> These two changes seem to provide a working v3.6-rc6 guest
> virtio-blk+IBLOCK setup that is (so far) passing fio write-verify
> against a local tcm_loop LUN..
>
> After changing max_sectors_kb 512 -> 2048 for virtio-blk, the avgrq-sz
> ratio between the virtio-blk and tcm_loop block devices is now 4K vs.
> 16K (previously 1K vs. 16K), which likely means a stack overflow
> somewhere in the virtio-blk -> virtio code while processing a large
> (8 MB) struct request generated by a SCSI initiator port.
>
> Not sure just yet if the qemu virtio-blk max_segments=126 -> 256 change
> is necessary for the work-around, but it might be worthwhile if you
> have a qemu build environment set up. Will try a bit more with
> max_segments=126 later today.
>

So during sustained testing overnight with the above changes, I still
managed to trigger another OOPs. However, this time it appears to be
pointing at virtio-blk.c code..
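(Side note: the guest-side counterpart of the seg_max change quoted above
lives in virtblk_probe(). Roughly, the v3.6-era drivers/block/virtio_blk.c
logic looks like the sketch below -- paraphrased from memory, not verbatim
source -- which is why a seg_max of 256 from qemu shows up as
max_segments=256 on the guest vdX request queue:)

	u32 sg_elems;
	int err;

	/* Read seg_max from virtio-blk config space; fall back to a
	 * single segment if VIRTIO_BLK_F_SEG_MAX is not offered. */
	err = virtio_config_val(vdev, VIRTIO_BLK_F_SEG_MAX,
				offsetof(struct virtio_blk_config, seg_max),
				&sg_elems);
	if (err || !sg_elems)
		sg_elems = 1;

	/* Two extra sg elements are reserved at head and tail for the
	 * request header and status byte. */
	sg_elems += 2;

	/* ... */

	/* Export the usable count to the block layer, so qemu's
	 * seg_max of 256 becomes max_segments=256 on the guest queue. */
	blk_queue_max_segments(q, sg_elems - 2);

Anyway, here's the OOPs from the overnight run: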
[ 7728.484801] BUG: unable to handle kernel paging request at 0000000200000045
[ 7728.485713] IP: [<ffffffffa006717b>] blk_done+0x51/0xf1 [virtio_blk]
[ 7728.485713] PGD 0
[ 7728.485713] Oops: 0000 [#1] SMP
[ 7728.485713] Modules linked in: ib_srpt ib_cm ib_sa ib_mad ib_core tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 e1000 virtio_blk sd_mod sr_mod cdrom virtio_pci virtio_ring virtio ata_piix libata
[ 7728.485713] CPU 1
[ 7728.485713] Pid: 0, comm: swapper/1 Not tainted 3.6.0-rc6+ #4 Bochs Bochs
[ 7728.485713] RIP: 0010:[<ffffffffa006717b>]  [<ffffffffa006717b>] blk_done+0x51/0xf1 [virtio_blk]
[ 7728.485713] RSP: 0018:ffff88007fc83e98  EFLAGS: 00010093
[ 7728.485713] RAX: 0000000200000001 RBX: ffff88007ad00000 RCX: 0000000000000080
[ 7728.485713] RDX: 000000000000538a RSI: 0000000000000000 RDI: ffffea0001eddc50
[ 7728.485713] RBP: 0000000000000092 R08: ffffea0001eddc58 R09: ffff88007d002700
[ 7728.485713] R10: ffffffffa00410f9 R11: fffffffffffffff0 R12: ffff88007fc83ea4
[ 7728.485713] R13: ffff88007b771478 R14: ffff88007bf197b0 R15: 0000000000000000
[ 7728.485713] FS:  0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
[ 7728.485713] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 7728.485713] CR2: 0000000200000045 CR3: 00000000014e7000 CR4: 00000000000006e0
[ 7728.485713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7728.485713] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 7728.485713] Process swapper/1 (pid: 0, threadinfo ffff88007d392000, task ffff88007d363380)
[ 7728.485713] Stack:
[ 7728.485713]  0000002924276c47 00200001810229fc 0000000100715b34 ffff88007ae12800
[ 7728.485713]  ffff88007bf19700 0000000000000000 0000000000000029 ffffffffa0041286
[ 7728.485713]  ffff880037bb8800 ffffffff8108448c 000010e324804758 0000000000000000
[ 7728.485713] Call Trace:
[ 7728.485713]  <IRQ>
[ 7728.485713]  [<ffffffffa0041286>] ? vring_interrupt+0x6f/0x76 [virtio_ring]
[ 7728.485713]  [<ffffffff8108448c>] ? handle_irq_event_percpu+0x2d/0x130
[ 7728.485713]  [<ffffffff810845bd>] ? handle_irq_event+0x2e/0x4c
[ 7728.485713]  [<ffffffff8108694f>] ? handle_edge_irq+0x98/0xb9
[ 7728.485713]  [<ffffffff81003aa7>] ? handle_irq+0x17/0x20
[ 7728.485713]  [<ffffffff810032da>] ? do_IRQ+0x45/0xad
[ 7728.485713]  [<ffffffff8137f72a>] ? common_interrupt+0x6a/0x6a
[ 7728.485713]  <EOI>
[ 7728.485713]  [<ffffffff810229fc>] ? native_safe_halt+0x2/0x3
[ 7728.485713]  [<ffffffff8100887e>] ? default_idle+0x23/0x3f
[ 7728.485713]  [<ffffffff81008b02>] ? cpu_idle+0x6b/0xaa
[ 7728.485713]  [<ffffffff81379f33>] ? start_secondary+0x1f5/0x1fa
[ 7728.485713] Code: 8b b8 b0 03 00 00 e8 55 84 31 e1 48 89 c5 eb 6e 41 8a 45 28 be fb ff ff ff 3c 02 77 0a 0f b6 c0 8b 34 85 70 81 06 a0 49 8b 45 00 <8b> 50 44 83 fa 02 74 07 83 fa 07 75 31 eb 22 41 8b 55 24 89 90
[ 7728.485713] RIP  [<ffffffffa006717b>] blk_done+0x51/0xf1 [virtio_blk]
[ 7728.485713]  RSP <ffff88007fc83e98>
[ 7728.485713] CR2: 0000000200000045
[ 7728.485713] ---[ end trace de9d8ade00a76876 ]---
[ 7728.485713] Kernel panic - not syncing: Fatal exception in interrupt

So looking at the RIP with gdb, it points into the following code that
pulls a struct virtblk_req *vbr off the virtio_ring with
virtqueue_get_buf():

(gdb) list *(blk_done+0x51)
0x19f is in blk_done (drivers/block/virtio_blk.c:82).
77		default:
78			error = -EIO;
79			break;
80		}
81
82		switch (vbr->req->cmd_type) {
83		case REQ_TYPE_BLOCK_PC:
84			vbr->req->resid_len = vbr->in_hdr.residual;
85			vbr->req->sense_len = vbr->in_hdr.sense_len;
86			vbr->req->errors = vbr->in_hdr.errors;
(gdb)

So it's starting to look pretty clear that the virtio_ring used by
virtio-blk is somehow getting messed up.. I'm now enabling DEBUG within
the virtio_ring.ko code to try to get some more details.

virtio_ring folks (Rusty + MST CC'ed), is there any other debug code
that would be helpful to track this down..?

Thanks,

--nab
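P.S. One untested idea while waiting on the virtio_ring DEBUG output: a
defensive check in blk_done() that validates the cookie returned by
virtqueue_get_buf() before dereferencing it, so a corrupted ring produces
a loud error instead of a paging fault. A hypothetical debug hack against
the v3.6 loop (a sketch, not a fix):

	/* Hypothetical debug-only check for blk_done() in
	 * drivers/block/virtio_blk.c: bail out loudly if the vbr
	 * cookie or its request pointer is not a valid kernel
	 * address, instead of faulting on vbr->req->cmd_type. */
	while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
		if (!virt_addr_valid(vbr) || !virt_addr_valid(vbr->req)) {
			pr_err("virtio_blk: bogus vbr %p from virtqueue_get_buf()\n",
			       vbr);
			break;
		}
		/* ... existing completion handling ... */
	}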