Re: Too big sectors - exceeding fabric_max_sectors

Rusty Russell <rusty@xxxxxxxxxxxxxxx> · Mon, 19 Nov 2012 13:49:55 +1030

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> writes:
> So during sustained testing overnight with the above changes, I still
> managed to trigger another OOPs.  However this time it appears to be
> pointing at virtio-blk.c code..
>
> [ 7728.484801] BUG: unable to handle kernel paging request at 0000000200000045
> [ 7728.485713] IP: [<ffffffffa006717b>] blk_done+0x51/0xf1 [virtio_blk]
> [ 7728.485713] PGD 0 
> [ 7728.485713] Oops: 0000 [#1] SMP 
> [ 7728.485713] Modules linked in: ib_srpt ib_cm ib_sa ib_mad ib_core tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc iscsi_target_mod target_core_pscsi target_core_file target_core_iblock target_core_mod configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables x_tables ipv6 e1000 virtio_blk sd_mod sr_mod cdrom virtio_pci virtio_ring virtio ata_piix libata
> [ 7728.485713] CPU 1 
> [ 7728.485713] Pid: 0, comm: swapper/1 Not tainted 3.6.0-rc6+ #4 Bochs Bochs
> [ 7728.485713] RIP: 0010:[<ffffffffa006717b>]  [<ffffffffa006717b>] blk_done+0x51/0xf1 [virtio_blk]
> [ 7728.485713] RSP: 0018:ffff88007fc83e98  EFLAGS: 00010093
> [ 7728.485713] RAX: 0000000200000001 RBX: ffff88007ad00000 RCX: 0000000000000080
> [ 7728.485713] RDX: 000000000000538a RSI: 0000000000000000 RDI: ffffea0001eddc50
> [ 7728.485713] RBP: 0000000000000092 R08: ffffea0001eddc58 R09: ffff88007d002700
> [ 7728.485713] R10: ffffffffa00410f9 R11: fffffffffffffff0 R12: ffff88007fc83ea4
> [ 7728.485713] R13: ffff88007b771478 R14: ffff88007bf197b0 R15: 0000000000000000
> [ 7728.485713] FS:  0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> [ 7728.485713] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 7728.485713] CR2: 0000000200000045 CR3: 00000000014e7000 CR4: 00000000000006e0
> [ 7728.485713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 7728.485713] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 7728.485713] Process swapper/1 (pid: 0, threadinfo ffff88007d392000, task ffff88007d363380)
> [ 7728.485713] Stack:
> [ 7728.485713]  0000002924276c47 00200001810229fc 0000000100715b34 ffff88007ae12800
> [ 7728.485713]  ffff88007bf19700 0000000000000000 0000000000000029 ffffffffa0041286
> [ 7728.485713]  ffff880037bb8800 ffffffff8108448c 000010e324804758 0000000000000000
> [ 7728.485713] Call Trace:
> [ 7728.485713]  <IRQ> 
> [ 7728.485713]  [<ffffffffa0041286>] ? vring_interrupt+0x6f/0x76 [virtio_ring]
> [ 7728.485713]  [<ffffffff8108448c>] ? handle_irq_event_percpu+0x2d/0x130
> [ 7728.485713]  [<ffffffff810845bd>] ? handle_irq_event+0x2e/0x4c
> [ 7728.485713]  [<ffffffff8108694f>] ? handle_edge_irq+0x98/0xb9
> [ 7728.485713]  [<ffffffff81003aa7>] ? handle_irq+0x17/0x20
> [ 7728.485713]  [<ffffffff810032da>] ? do_IRQ+0x45/0xad
> [ 7728.485713]  [<ffffffff8137f72a>] ? common_interrupt+0x6a/0x6a
> [ 7728.485713]  <EOI> 
> [ 7728.485713]  [<ffffffff810229fc>] ? native_safe_halt+0x2/0x3
> [ 7728.485713]  [<ffffffff8100887e>] ? default_idle+0x23/0x3f
> [ 7728.485713]  [<ffffffff81008b02>] ? cpu_idle+0x6b/0xaa
> [ 7728.485713]  [<ffffffff81379f33>] ? start_secondary+0x1f5/0x1fa
> [ 7728.485713] Code: 8b b8 b0 03 00 00 e8 55 84 31 e1 48 89 c5 eb 6e 41 8a 45 28 be fb ff ff ff 3c 02 77 0a 0f b6 c0 8b 34 85 70 81 06 a0 49 8b 45 00 <8b> 50 44 83 fa 02 74 07 83 fa 07 75 31 eb 22 41 8b 55 24 89 90 
> [ 7728.485713] RIP  [<ffffffffa006717b>] blk_done+0x51/0xf1 [virtio_blk]
> [ 7728.485713]  RSP <ffff88007fc83e98>
> [ 7728.485713] CR2: 0000000200000045
> [ 7728.485713] ---[ end trace de9d8ade00a76876 ]---
> [ 7728.485713] Kernel panic - not syncing: Fatal exception in interrupt
>
> So looking at the RIP with gdb, it points into the following code that pulls
> a struct virtblk_req *vbr off the virtio_ring with virtqueue_get_buf():
>
> (gdb) list *(blk_done+0x51)
> 0x19f is in blk_done (drivers/block/virtio_blk.c:82).
> 77                      default:
> 78                              error = -EIO;
> 79                              break;
> 80                      }
> 81
> 82                      switch (vbr->req->cmd_type) {
> 83                      case REQ_TYPE_BLOCK_PC:
> 84                              vbr->req->resid_len = vbr->in_hdr.residual;
> 85                              vbr->req->sense_len = vbr->in_hdr.sense_len;
> 86                              vbr->req->errors = vbr->in_hdr.errors;
> (gdb)
>
> So it's starting to look pretty clear that the virtio_ring used by
> virtio-blk is somehow getting messed up..  Now enabling DEBUG within
> virtio_ring.ko code to try and get some more details.
>
> virtio_ring folks (Rusty + MST CC'ed), is there any other debug code
> that would be helpful to track this down..?

It looks like either vbr is complete crap, or already freed.  Let's make
sure.

Assuming this is true:
1) We have a race in the virtio_blk driver, which is corrupting the ring
   (e.g. simultaneous virtqueue_get_buf calls).  Locking looks pretty
   trivial here though.  DEBUG might help with this.

2) Qemu has a bug and is screwing up the ring, giving us a request
   twice.

3) The virtio_ring core has a bug.  This is least likely, though of
   course not impossible.

Here's a patch to try which should tell us what species of corruption
it is:

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 303779c..3e3081f 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -55,6 +55,7 @@ struct virtio_blk
 
 struct virtblk_req
 {
+	u32 magic;
 	struct list_head list;
 	struct request *req;
 	struct virtio_blk_outhdr out_hdr;
@@ -73,6 +74,11 @@ static void blk_done(struct virtqueue *vq)
 	while ((vbr = virtqueue_get_buf(vblk->vq, &len)) != NULL) {
 		int error;
 
+		if (unlikely(vbr->magic != 0x87654321)) {
+			printk(KERN_ERR "vbr bad magic: 0x%08x\n", vbr->magic);
+			continue; /* And pray... */
+		}
+
 		switch (vbr->status) {
 		case VIRTIO_BLK_S_OK:
 			error = 0;
@@ -100,6 +106,7 @@ static void blk_done(struct virtqueue *vq)
 
 		__blk_end_request_all(vbr->req, error);
 		list_del(&vbr->list);
+		vbr->magic = 0xfee1dead;
 		mempool_free(vbr, vblk->pool);
 	}
 	/* In case queue is stopped waiting for more buffers. */
@@ -117,6 +124,7 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 	if (!vbr)
 		/* When another request finishes we'll try again. */
 		return false;
+	vbr->magic = 0x11111111;
 
 	vbr->req = req;
 
@@ -179,7 +187,9 @@ static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 		}
 	}
 
+	vbr->magic = 0x87654321;
 	if (virtqueue_add_buf(vblk->vq, vblk->sg, out, in, vbr, GFP_ATOMIC)<0) {
+		vbr->magic = 0xc0ffee;
 		mempool_free(vbr, vblk->pool);
 		return false;
 	}
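
For reading the output: the magic value that gets printed should narrow
down which of the cases above we are in.  Below is a rough userspace
sketch of how I would interpret it.  The four constants are the ones in
the patch; the enum, the function names and the wording of each verdict
are hypothetical and only for illustration, and it assumes the freed
vbr memory has not been reused by the time blk_done looks at it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Magic values copied from the debug patch above. */
#define VBR_MAGIC_SETUP    0x11111111u /* do_req(): allocated, still being filled in */
#define VBR_MAGIC_IN_RING  0x87654321u /* do_req(): about to be added to the ring */
#define VBR_MAGIC_ADD_FAIL 0x00c0ffeeu /* do_req(): virtqueue_add_buf() failed, vbr freed */
#define VBR_MAGIC_DONE     0xfee1deadu /* blk_done(): request completed, vbr freed */

/* Hypothetical verdicts; these names do not exist in the driver. */
enum vbr_verdict {
	VBR_OK,                /* expected state for a buffer coming off the ring */
	VBR_NEVER_SUBMITTED,   /* returned before do_req() finished building it */
	VBR_ADD_FAILED,        /* returned although adding it to the ring failed */
	VBR_ALREADY_COMPLETED, /* returned a second time after blk_done() freed it */
	VBR_GARBAGE,           /* pointer does not look like a vbr at all */
};

/* Map the magic seen in blk_done() to the kind of corruption it suggests. */
static enum vbr_verdict vbr_classify(uint32_t magic)
{
	switch (magic) {
	case VBR_MAGIC_IN_RING:  return VBR_OK;
	case VBR_MAGIC_SETUP:    return VBR_NEVER_SUBMITTED;
	case VBR_MAGIC_ADD_FAIL: return VBR_ADD_FAILED;
	case VBR_MAGIC_DONE:     return VBR_ALREADY_COMPLETED;
	default:                 return VBR_GARBAGE;
	}
}

static const char *vbr_verdict_str(enum vbr_verdict v)
{
	switch (v) {
	case VBR_OK:                return "in ring, as expected";
	case VBR_NEVER_SUBMITTED:   return "handed back before it was submitted";
	case VBR_ADD_FAILED:        return "handed back after a failed add";
	case VBR_ALREADY_COMPLETED: return "handed back twice (already freed)";
	default:                    return "garbage pointer or overwritten memory";
	}
}

/* Print one line per magic value, in the same format the patch logs it. */
static void vbr_report(uint32_t magic)
{
	printf("vbr magic 0x%08x: %s\n", (unsigned int)magic,
	       vbr_verdict_str(vbr_classify(magic)));
}

/* Small self-check covering every state the patch can leave behind. */
static void vbr_selfcheck(void)
{
	assert(vbr_classify(VBR_MAGIC_IN_RING) == VBR_OK);
	assert(vbr_classify(VBR_MAGIC_SETUP) == VBR_NEVER_SUBMITTED);
	assert(vbr_classify(VBR_MAGIC_ADD_FAIL) == VBR_ADD_FAILED);
	assert(vbr_classify(VBR_MAGIC_DONE) == VBR_ALREADY_COMPLETED);
	assert(vbr_classify(0x00000045u) == VBR_GARBAGE);
	vbr_report(VBR_MAGIC_DONE);
}
```

If the log shows 0xfee1dead, that would point at a request being
returned twice (cases 1 and 2 above); a value matching none of the four
constants would suggest the pointer itself is bogus.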

