Re: Latest kernel NULL pointer deref when running mke2fs

Kent Overstreet <kmo@xxxxxxxxxxxxx> · Mon, 10 Feb 2014 15:08:18 -0800



On Tue, Feb 04, 2014 at 12:04:30PM -0500, Chris Mason wrote:
> 
> [ + Kent, Jens, Neil ]
> 
> On 02/04/2014 11:09 AM, Richard W.M. Jones wrote:
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1061339
> > 
> > It seems to happen when mke2fs issues an ioctl, looks like it might
> > be related to TRIM/discard.
> > 
> > This is under virtualization.  The disk is backed by virtio-scsi.
> > 
> > mke2fs -t ext2 -F -b 4096 /dev/VG/LV1
> > mke2fs 1.42.9 (28-Dec-2013)
> > [   44.142483] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> > [   44.142483] IP: [<ffffffff8122040a>] bio_trim+0x1a/0x40
> > [   44.142483] PGD 1d193067 PUD 1d1c1067 PMD 0
> > [   44.142483] Oops: 0000 [#1] SMP
> > [   44.142483] Modules linked in: raid1 kvm_amd snd_pcsp snd_pcm kvm snd_timer snd soundcore serio_raw ata_generic pata_acpi virtio_balloon virtio_pci virtio_mmio virtio_net virtio_scsi virtio_blk virtio_console virtio_rng virtio_ring virtio ideapad_laptop sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc32 crc_itu_t libcrc32c megaraid megaraid_sas megaraid_mbox megaraid_mm
> > [   44.142483] CPU: 0 PID: 229 Comm: mke2fs Tainted: G        W    3.14.0-0.rc1.git0.1.fc21.x86_64 #1
> > [   44.142483] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > [   44.142483] task: ffff88001c100000 ti: ffff88001c0e4000 task.ti: ffff88001c0e4000
> > [   44.142483] RIP: 0010:[<ffffffff8122040a>]  [<ffffffff8122040a>] bio_trim+0x1a/0x40
> > [   44.142483] RSP: 0018:ffff88001c0e5b88  EFLAGS: 00000246
> > [   44.142483] RAX: ffff88001d13f020 RBX: 0000000000000000 RCX: 000000000000b690
> > [   44.142483] RDX: 0000000000008000 RSI: 0000000000000000 RDI: 0000000000000000
> > [   44.142483] RBP: ffff88001c0e5b98 R08: 00000000000174a0 R09: ffff88001f0174a0
> > [   44.142483] R10: 0000000000000000 R11: ffffea0000744fc0 R12: 0000000001000000
> > [   44.142483] R13: 0000000000000000 R14: ffff88001c0bfe80 R15: ffff88001d16df00
> > [   44.142483] FS:  00007fe89c7817c0(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
> > [   44.142483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   44.142483] CR2: 0000000000000028 CR3: 000000001c0e7000 CR4: 00000000000006f0
> > [   44.142483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   44.142483] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
> > [   44.142483] Stack:
> > [   44.142483]  0000000000000001 0000000000000000 ffff88001c0e5c80 ffffffffa01923f3
> > [   44.142483]  ffff88001c0e5c50 ffffc90000125040 0000000000008000 ffff88001d16df60
> > [   44.142483]  0000000000003000 ffff88001c0e5c18 ffffffff00008000 0000000000000001
> > [   44.142483] Call Trace:
> > [   44.142483]  [<ffffffffa01923f3>] make_request+0x4c3/0xcd0 [raid1]
> 
> Based on the oops, we're passing a NULL bio to bio_trim from the MD raid1 make_request.
> 
> Not really sure how we get this far, but my guess is it happens here:
> 
>                 mbio = bio_clone_mddev(bio, GFP_NOIO, mddev);
>                 bio_trim(mbio, r1_bio->sector - bio->bi_iter.bi_sector, max_sectors);
> 
> Guessing mbio is NULL because bio_clone is trying to count the iovecs.
> bio_for_each_segment expects the bvs to be set up, and since this is a
> discard bio, they are not.

Sorry for the delay, I just got back. Your analysis looks correct to me; I'll
mail out a patch shortly.
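
For illustration, the failure mode in the quoted analysis can be modelled in
user space. This is only a sketch, not kernel code and not the patch mentioned
above: struct model_bio and the model_* functions are hypothetical stand-ins,
and the condition under which the clone fails is simply the guess from the
quoted analysis, hard-coded as an assumption. In the oops, RDI (the first
argument on x86_64) is 0 and CR2 is 0x28, which is consistent with bio_trim
reading a field through a NULL bio.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for struct bio. */
struct model_bio {
	unsigned long long sector;	/* start sector */
	unsigned int size;		/* remaining size in bytes */
	int is_discard;			/* discard bios carry no data pages */
	unsigned int nr_vecs;		/* number of populated bvecs */
};

/*
 * Models a clone that fails for a discard bio with no bvecs.
 * ASSUMPTION: this mirrors the guess in the quoted analysis, not the
 * actual logic of bio_clone_mddev().
 */
static struct model_bio *model_bio_clone(const struct model_bio *src)
{
	struct model_bio *clone;

	if (src->is_discard && src->nr_vecs == 0)
		return NULL;

	clone = malloc(sizeof(*clone));
	if (clone)
		*clone = *src;
	return clone;
}

/*
 * Models bio_trim(): skip 'offset' sectors, then keep 'sectors' sectors.
 * The assert stands in for the oops; the traced call had no such check.
 */
static void model_bio_trim(struct model_bio *bio, unsigned int offset,
			   unsigned int sectors)
{
	assert(bio != NULL);
	bio->sector += offset;
	bio->size = sectors << 9;	/* 512-byte sectors to bytes */
}

/*
 * Clone then trim, refusing to trim a failed clone.
 * Returns 0 and stores the clone in *out on success, -1 otherwise.
 */
static int model_clone_and_trim(const struct model_bio *src,
				unsigned int offset, unsigned int sectors,
				struct model_bio **out)
{
	struct model_bio *mbio = model_bio_clone(src);

	if (!mbio)
		return -1;

	model_bio_trim(mbio, offset, sectors);
	*out = mbio;
	return 0;
}
```

Calling model_clone_and_trim() on a model_bio with is_discard = 1 and
nr_vecs = 0 returns -1 instead of reaching the trim step. The NULL check only
makes the failure visible in the model; whether the real fix belongs in the
clone path or in the caller is a separate question this sketch does not answer.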