Herton Ronaldo Krzesinski <herton.krzesinski@xxxxxxxxxxxxx> wrote: >Hi, > >Recently I received a report about crashes with mxb v4l driver >(https://bugs.launchpad.net/ubuntu/+source/linux/+bug/745213), there >are >two slightly different reported crashes which I paste here: > >1) BootDmesg.txt crash: > >[ 21.210232] general protection fault: 0000 [#1] SMP >[ 21.210292] last sysfs file: /sys/bus/i2c/drivers/tda9840/uevent >[ 21.210357] CPU 0 >[ 21.210379] Modules linked in: tda9840 tea6415c tea6420 >snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec >saa7115 mxb(+) snd_seq_midi snd_hwdep snd_rawmidi snd_pcm >snd_seq_midi_event snd_seq snd_timer snd_seq_device saa7146_vv saa7146 >snd videobuf_dma_sg videobuf_core v4l2_common videodev edac_core >soundcore shpchp snd_page_alloc sp5100_tco edac_mce_amd xhci_hcd >v4l2_compat_ioctl32 k10temp i2c_piix4 asus_atk0110 lp parport radeon >usbhid ttm hid drm_kms_helper drm ahci sym53c8xx firewire_ohci e1000 >scsi_transport_spi r8169 pata_atiixp firewire_core i2c_algo_bit libahci >crc_itu_t pata_via >[ 21.211030] >[ 21.211047] Pid: 812, comm: work_for_cpu Not tainted >2.6.38-7-generic #39-Ubuntu System manufacturer System Product >Name/M4A88TD-V EVO/USB3 >[ 21.211187] RIP: 0010:[<ffffffffa029e745>] [<ffffffffa029e745>] >v4l2_device_register_subdev+0x95/0x170 [videodev] >[ 21.211304] RSP: 0018:ffff880209de5d10 EFLAGS: 00010202 >[ 21.211362] RAX: 6564656572662e67 RBX: 00000000ffffffea RCX: >ffff88020f38d0c8 >[ 21.211438] RDX: ffff88020ea90e40 RSI: ffff88020b5ad380 RDI: >ffff88020eb72018 >[ 21.211514] RBP: ffff880209de5d40 R08: 0000000000000000 R09: >dead000000200200 >[ 21.211590] R10: 0000000000000001 R11: 0000000000000001 R12: >ffff88020b5ad380 >[ 21.211667] R13: ffff88020eb72018 R14: ffffffffa03820e0 R15: >ffff88020eb72018 >[ 21.211743] FS: 00007f9933fa1720(0000) GS:ffff8800cfc00000(0000) >knlGS:0000000000000000 >[ 21.211831] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >[ 21.211892] CR2: 000000000048c000 CR3: 000000020fb3b000 CR4: >00000000000006f0 >[ 21.211968] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >0000000000000000 >[ 21.212045] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: >0000000000000400 >[ 21.212121] Process work_for_cpu (pid: 812, threadinfo >ffff880209de4000, task ffff88020d2716c0) >[ 21.212214] Stack: >[ 21.212236] ffff880209de5d40 ffffffff8146fc05 ffff88020b097c00 >ffff88020b5ad380 >[ 21.212318] ffff88020eb72018 ffffffffa03820e0 ffff880209de5d70 >ffffffffa028f6b5 >[ 21.212401] ffff880209de5d80 0000000000000000 0000000000000042 >ffff88020eb52010 >[ 21.212484] Call Trace: >[ 21.212513] [<ffffffff8146fc05>] ? i2c_new_device+0x135/0x1c0 >[ 21.212578] [<ffffffffa028f6b5>] >v4l2_i2c_new_subdev_board+0xf5/0x160 [v4l2_common] >[ 21.212663] [<ffffffffa028f795>] v4l2_i2c_new_subdev+0x75/0xa0 >[v4l2_common] >[ 21.215825] [<ffffffffa0333714>] mxb_probe+0x124/0x260 [mxb] >[ 21.219016] [<ffffffffa03345a3>] mxb_attach+0x33/0x1f0 [mxb] >[ 21.220214] [<ffffffffa02e24c7>] saa7146_init_one+0x887/0x13c0 >[saa7146] >[ 21.220214] [<ffffffff812fef4f>] local_pci_probe+0x5f/0xd0 >[ 21.220214] [<ffffffff8107f890>] ? do_work_for_cpu+0x0/0x30 >[ 21.220214] [<ffffffff8107f8a8>] do_work_for_cpu+0x18/0x30 >[ 21.220214] [<ffffffff81086fe6>] kthread+0x96/0xa0 >[ 21.220214] [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 >[ 21.220214] [<ffffffff81086f50>] ? kthread+0x0/0xa0 >[ 21.220214] [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 >[ 21.220214] Code: f6 74 19 41 83 3e 02 0f 84 eb 00 00 00 49 8b 86 68 >02 00 00 65 ff 00 66 66 66 66 90 49 8b 44 24 30 4d 89 6c 24 20 48 85 c0 >74 13 <48> 8b 00 48 85 c0 74 0b 4c 89 e7 ff d0 85 c0 89 c3 75 85 49 8b >[ 21.220214] RIP [<ffffffffa029e745>] >v4l2_device_register_subdev+0x95/0x170 [videodev] >[ 21.220214] RSP <ffff880209de5d10> >[ 21.260394] ---[ end trace 5adfede4ff2f2907 ]--- > >2) Another crash, OopsText.txt: > >043b >IP: [<ffffffffa014c745>] v4l2_device_register_subdev+0x95/0x170 >[videodev] >PGD 0 >Oops: 0000 [#1] SMP >last sysfs file: /sys/bus/i2c/drivers/tea6415c/uevent >CPU 0 >Modules linked in: tea6415c binfmt_misc tea6420 snd_hda_codec_hdmi >snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm >snd_seq_midi snd_rawmidi saa7115 snd_seq_midi_event edac_core >edac_mce_amd snd_seq mxb(+) saa7146_vv k10temp saa7146 videobuf_dma_sg >videobuf_core asus_atk0110 v4l2_common snd_timer snd_seq_device >videodev v4l2_compat_ioctl32 snd soundcore snd_page_alloc sp5100_tco >i2c_piix4 xhci_hcd shpchp lp parport usbhid hid sym53c8xx ahci >scsi_transport_spi e1000 libahci firewire_ohci r8169 pata_via >pata_atiixp firewire_core crc_itu_t > >Pid: 589, comm: work_for_cpu Not tainted 2.6.38-7-generic #39-Ubuntu >System manufacturer System Product Name/M4A88TD-V EVO/USB3 >RIP: 0010:[<ffffffffa014c745>] [<ffffffffa014c745>] >v4l2_device_register_subdev+0x95/0x170 [videodev] >RSP: 0018:ffff88020e8dfd10 EFLAGS: 00010202 >RAX: 000000000000043b RBX: 00000000ffffffea RCX: ffff88020e889908 >RDX: ffff88020aeeb240 RSI: ffff88020f5a8200 RDI: ffff88020ff61a18 >RBP: ffff88020e8dfd40 R08: 0000000000000000 R09: 0000000000000002 >R10: 0000000000000000 R11: 0000000000000004 R12: ffff88020f5a8200 >R13: ffff88020ff61a18 R14: ffffffffa01f10e0 R15: ffff88020ff61a18 >FS: 00007f0bb556e720(0000) GS:ffff8800cfc00000(0000) >knlGS:0000000000000000 >CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >CR2: 000000000000043b CR3: 0000000001a03000 CR4: 00000000000006f0 >DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >Process work_for_cpu (pid: 589, threadinfo ffff88020e8de000, task >ffff88020e8cdb00) >Stack: > ffff88020e8dfd40 ffffffff8146fc05 ffff88020691b800 ffff88020f5a8200 > ffff88020ff61a18 ffffffffa01f10e0 ffff88020e8dfd70 ffffffffa01726b5 > ffff88020e8dfd80 0000000000000000 0000000000000043 ffff88020dac6010 >Call Trace: > [<ffffffff8146fc05>] ? i2c_new_device+0x135/0x1c0 >[<ffffffffa01726b5>] v4l2_i2c_new_subdev_board+0xf5/0x160 [v4l2_common] > [<ffffffffa0172795>] v4l2_i2c_new_subdev+0x75/0xa0 [v4l2_common] > [<ffffffffa01d06f3>] mxb_probe+0x103/0x260 [mxb] > [<ffffffffa01d15a3>] mxb_attach+0x33/0x1f0 [mxb] > [<ffffffffa01924c7>] saa7146_init_one+0x887/0x13c0 [saa7146] > [<ffffffff812fef4f>] local_pci_probe+0x5f/0xd0 > [<ffffffff8107f890>] ? do_work_for_cpu+0x0/0x30 > [<ffffffff8107f8a8>] do_work_for_cpu+0x18/0x30 > [<ffffffff81086fe6>] kthread+0x96/0xa0 > [<ffffffff8100ce24>] kernel_thread_helper+0x4/0x10 > [<ffffffff81086f50>] ? kthread+0x0/0xa0 > [<ffffffff8100ce20>] ? kernel_thread_helper+0x0/0x10 >Code: f6 74 19 41 83 3e 02 0f 84 eb 00 00 00 49 8b 86 68 02 00 00 65 ff >00 66 66 66 66 90 49 8b 44 24 30 4d 89 6c 24 20 48 85 c0 74 13 <48> 8b >00 48 85 c0 74 0b 4c 89 e7 ff d0 85 c0 89 c3 75 85 49 8b >RIP [<ffffffffa014c745>] v4l2_device_register_subdev+0x95/0x170 >[videodev] > RSP <ffff88020e8dfd10> >CR2: 000000000000043b >---[ end trace f6215d41cb05d370 ]--- > > >The crashs are on same place, v4l2_device_register_subdev+0x95/0x170 > >Using the debug symbols of kernel above [1] and source [2], we can see: >(gdb) l *(v4l2_device_register_subdev+0x95) >0x6775 is in v4l2_device_register_subdev >(/build/buildd/linux-2.6.38/drivers/media/video/v4l2-device.c:132). >127 /* Warn if we apparently re-register a subdev */ >128 WARN_ON(sd->v4l2_dev != NULL); >129 if (!try_module_get(sd->owner)) >130 return -ENODEV; >131 sd->v4l2_dev = v4l2_dev; >132 if (sd->internal_ops && sd->internal_ops->registered) { >133 err = sd->internal_ops->registered(sd); >134 if (err) >135 return err; >136 } > >So the crash points out to be in dereference of sd->internal_ops, as if >it was sd it would likely crash earlier as sd is used previously in the >code. > >And indeed if we look at decodecode of the oops, it matches >sd->internal_ops->registered dereference where the crash happens: > > 0: f6 74 19 41 divb 0x41(%rcx,%rbx,1) > 4: 83 3e 02 cmpl $0x2,(%rsi) > 7: 0f 84 eb 00 00 00 je 0xf8 > d: 49 8b 86 68 02 00 00 mov 0x268(%r14),%rax > 14: 65 ff 00 incl %gs:(%rax) > 17: 66 66 66 66 90 data32 data32 data32 xchg %ax,%ax > 1c: 49 8b 44 24 30 mov 0x30(%r12),%rax > 21: 4d 89 6c 24 20 mov %r13,0x20(%r12) > 26: 48 85 c0 test %rax,%rax > 29: 74 13 je 0x3e >2b:* 48 8b 00 mov (%rax),%rax <-- trapping >instruction > 2e: 48 85 c0 test %rax,%rax > 31: 74 0b je 0x3e > 33: 4c 89 e7 mov %r12,%rdi > 36: ff d0 callq *%rax > 38: 85 c0 test %eax,%eax > 3a: 89 c3 mov %eax,%ebx > 3c: 75 85 jne 0xffffffffffffffc3 > 3e: 49 rex.WB > 3f: 8b .byte 0x8b > >(gdb) p &((struct v4l2_subdev *)0)->internal_ops >$1 = (const struct v4l2_subdev_internal_ops **) 0x30 >(gdb) p &((struct v4l2_subdev_internal_ops *)0)->registered >$2 = (int (**)(struct v4l2_subdev *)) 0x0 > >So it dereferences and tests successfuly sd->internal_ops: >mov 0x30(%r12),%rax >... >test %rax,%rax >and when it tries to dereference sd->internal_ops->registered it >crashes: >mov (%rax),%rax > >But looking at oopses above, look that RAX in each case has a different >value, in first it tries to dereference 0x6564656572662e67, in the >other >it is 0x43b instead, so random values in sd->internal_ops > >Now wonder why internal_ops got random values. Well, looking back in >the >traces, specially at mxb_probe where each crash happens, we can see: > >* for the first oops: >(gdb) l *(mxb_probe+0x124) >0x744 is in mxb_probe >(/build/buildd/linux-2.6.38/drivers/media/video/mxb.c:197). >192 "tea6420", I2C_TEA6420_2, NULL); >193 mxb->tea6415c = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >194 "tea6415c", I2C_TEA6415C, NULL); >195 mxb->tda9840 = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >196 "tda9840", I2C_TDA9840, NULL); >197 mxb->tuner = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >198 "tuner", I2C_TUNER, NULL); >199 >200 /* check if all devices are present */ >201 if (!mxb->tea6420_1 || !mxb->tea6420_2 || !mxb->tea6415c || > >* for the second oops: >(gdb) l *(mxb_probe+0x103) >0x723 is in mxb_probe >(/build/buildd/linux-2.6.38/drivers/media/video/mxb.c:195). >190 "tea6420", I2C_TEA6420_1, NULL); >191 mxb->tea6420_2 = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >192 "tea6420", I2C_TEA6420_2, NULL); >193 mxb->tea6415c = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >194 "tea6415c", I2C_TEA6415C, NULL); >195 mxb->tda9840 = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >196 "tda9840", I2C_TDA9840, NULL); >197 mxb->tuner = v4l2_i2c_new_subdev(&dev->v4l2_dev, >&mxb->i2c_adapter, >198 "tuner", I2C_TUNER, NULL); >199 > >Hmm the crash seems to be at random, in the cases above probably >when getting tea6415c/tda9840. > >Looking at the code then, notice how sd (struct v4l2_subdev *) is >allocated, for example in tda9840_probe we have >sd = kmalloc(sizeof(struct v4l2_subdev), GFP_KERNEL); > >So it's not kzalloc, the same holds for tea6415c, its probe function >uses kmalloc as well. > >And this is why sd->internal_ops should be getting a random value. I >don't see anywhere in current code where we clear sd->internal_ops on >initialization, and as many of these tuners etc. allocate using just >kmalloc, we get random data in sd. > >A fix could be to drivers allocating/zeroing sd using kzalloc/memset >before use. > >But seems current v4l code assumes drivers can use kmalloc, and >initialization is done in v4l2_subdev_init, so in a reply to this I >propose a patch to initialize sd->internal_ops to null in >v4l2_subdev_init, which should fix these random crashes. > >[1] >http://ddebs.ubuntu.com/pool/main/l/linux/linux-image-2.6.38-7-generic-dbgsym_2.6.38-7.39_amd64.ddeb >[2] git://kernel.ubuntu.com/ubuntu/ubuntu-natty.git > (pointing to Ubuntu-2.6.38-7.39 tag checkout) > >-- >[]'s >Herton >-- >To unsubscribe from this list: send the line "unsubscribe linux-media" >in >the body of a message to majordomo@xxxxxxxxxxxxxxx >More majordomo info at http://vger.kernel.org/majordomo-info.html The one RAX register lloks like it has ASCII bytes in it. Sorry I can't look at these futher myself. I have urgent family needs this weekend. Regards, Andy -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html