On Thu, Oct 01, 2015 at 12:16:04PM +0200, Jens Axboe wrote:
> On 10/01/2015 11:00 AM, Michael S. Tsirkin wrote:
> >On Thu, Oct 01, 2015 at 03:10:14AM +0200, Thomas D. wrote:
> >>Hi,
> >>
> >>I have a virtual machine which fails to boot linux-4.1.8 while mounting
> >>file systems:
> >>
> >>>* Mounting local filesystem ...
> >>>------------[ cut here ]------------
> >>>kernel BUG at drivers/block/virtio_blk.c:172!
> >>>invalid opcode: 000 [#1] SMP
> >>>Modules linked in: pcspkr psmouse dm_log_userspace virtio_net e1000 fuse nfs lockd grace sunrpc fscache dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log usbhid usb_storage sr_mod cdrom
> >>>CPU: 7 PIDL 2254 Comm: dmcrypt_write Not tainted 4.1.8-gentoo #1
> >>>Hardware name: Red Hat KVM, BIOS seabios-1.7.5-8.el7 04/01/2014
> >>>task: ffff88061fb70000 ti: ffff88061ff30000 task.ti: ffff88061ff30000
> >>>RIP: 0010:[<ffffffffb4557b30>] [<ffffffffb4557b30>] virtio_queue_rq+0x210/0x2b0
> >>>RSP: 0018:ffff88061ff33ba8 EFLAGS: 00010202
> >>>RAX: 00000000000000b1 RBX: ffff88061fb2fc00 RCX: ffff88061ff33c30
> >>>RDX: 0000000000000008 RSI: ffff88061ff33c50 RDI: ffff88061fb2fc00
> >>>RBP: ffff88061ff33bf8 R08: ffff88061eef3540 R09: ffff88061ff33c30
> >>>R10: 0000000000000000 R11: 00000000000000af R12: 0000000000000000
> >>>R13: ffff88061eef3540 R14: ffff88061eef3540 R15: ffff880622c7ca80
> >>>FS: 0000000000000000(0000) GS:ffff88063fdc0000(0000) knlGS:0000000000000000
> >>>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>CR2: 0000000001ffe468 CR3: 00000000bb343000 CR4: 00000000001406e0
> >>>Stack:
> >>> ffff880622d4c478 0000000000000000 ffff88061ff33bd8 ffff88061fb2f
> >>> 0000000000000001 ffff88061fb2fc00 ffff88061ff33c30 0000000000000
> >>> ffff88061eef3540 0000000000000000 ffff88061ff33c98 ffffffffb43eb
> >>>
> >>>Call Trace:
> >>> [<ffffffffb43eb500>] __blk_mq_run_hw_queue+0x1d0/0x370
> >>> [<ffffffffb43eb315>] blk_mq_run_hw_queue+0x95/0xb0
> >>> [<ffffffffb43ec804>] blk_mq_flush_plug_list+0x129/0x140
> >>> [<ffffffffb43e33d8>] blk_finish_plug+0x18/0x50
> >>> [<ffffffffb45e3bea>] dmcrypt_write+0x1da/0x1f0
> >>> [<ffffffffb4108c90>] ? wake_up_state+0x20/0x20
> >>> [<ffffffffb45e3a10>] ? crypt_iv_lmk_dtr+0x60/0x60
> >>> [<ffffffffb40fb789>] kthread_create_on_node+0x180/0x180
> >>> [<ffffffffb4705e92>] ret_from_fork+0x42/0x70
> >>> [<ffffffffb40fb6c0>] ? kthread_create_on_node+0x180/0x180
> >>>Code: 00 0000 41 c7 85 78 01 00 00 08 00 00 00 49 c7 85 80 01 00 00 00 00 00 00 41 89 85 7c 01 00 00 e9 93 fe ff ff 66 0f 1f 44 00 00 <0f> 0b 66 0f 1f 44 00 00 49 8b 87 b0 00 00 00 41 83 e6 ef 4a 8b
> >>>RIP [<ffffffffb4557b30>] virtio_queue_rq+0x210/0x2b0
> >>> RSP: <ffff88061ff33ba8>
> >>>---[ end trace 8078357c459d5fc0 ]---
> >
> >
> >So this BUG_ON is from 1cf7e9c68fe84248174e998922b39e508375e7c1.
> >	commit 1cf7e9c68fe84248174e998922b39e508375e7c1
> >	Author: Jens Axboe <axboe@xxxxxxxxx>
> >	Date:   Fri Nov 1 10:52:52 2013 -0600
> >
> >	    virtio_blk: blk-mq support
> >
> >
> >	BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
> >
> >
> >On probe, we do
> >	/* We can handle whatever the host told us to handle. */
> >	blk_queue_max_segments(q, vblk->sg_elems-2);
> >
> >
> >To debug this,
> >maybe you can print out sg_elems at init time and when this fails,
> >to make sure some kind of memory corruption
> >does not change sg_elems after initialization?
> >
> >
> >Jens, how may we get more segments than blk_queue_max_segments?
> >Is driver expected to validate and drop such requests?
>
> The answer is that this should not happen. If the driver informs of a limit
> on the number of segments, that should never be exceeded. If it does, then
> it's a bug in either the SG mapping, or in the building of the request -
> either the request gets built too large for some reason, or the mapping
> doesn't always coalesce segments even though it should.
>
> The problem is that we get notified out-of-band, when we attempt to push the
> request to the driver. At this point, much of the context could be lost,
> like it is in your case.
>
> Looking at the specific virtio_blk case, it does seem that it is
> checking the segment count before mapping. Does the below fix the
> problem, or does the BUG_ON() still trigger?

Jens, I have no idea whether this is the right thing to do, so please merge
this patch directly if it makes sense.

>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 6ca35495a5be..1501701b0202 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -169,8 +169,6 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	int err;
>  	bool notify = false;
>  
> -	BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
> -
>  	vbr->req = req;
>  	if (req->cmd_flags & REQ_FLUSH) {
>  		vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_FLUSH);
> @@ -203,6 +201,7 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
>  
>  	num = blk_rq_map_sg(hctx->queue, vbr->req, vbr->sg);
>  	if (num) {
> +		BUG_ON(num + 2 > vblk->sg_elems);
>  		if (rq_data_dir(vbr->req) == WRITE)
>  			vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_OUT);
>  		else
>
> --
> Jens Axboe

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization