On 06/18/2015 12:33 PM, Jan Kara wrote: > On Mon 15-06-15 09:24:55, Hans Verkuil wrote: >> This reverts commit 48b25a3a713b90988b6882d318f7c0a6bed9aabc. >> >> That commit caused two regressions. The first is a BUG: >> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000100 >> IP: [<ffffffff810d5cd0>] __lock_acquire+0x2f0/0x2070 >> PGD 0 >> Oops: 0000 [#1] PREEMPT SMP >> Modules linked in: vivid v4l2_dv_timings videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev media vmw_balloon vmw_vmci acpi_cpufreq processor button >> CPU: 0 PID: 1542 Comm: v4l2-ctl Not tainted 4.1.0-rc3-test-media #1190 >> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 >> task: ffff880220ce4200 ti: ffff88021d16c000 task.ti: ffff88021d16c000 >> RIP: 0010:[<ffffffff810d5cd0>] [<ffffffff810d5cd0>] __lock_acquire+0x2f0/0x2070 >> RSP: 0018:ffff88021d16f9b8 EFLAGS: 00010002 >> RAX: 0000000000000046 RBX: 0000000000000292 RCX: 0000000000000001 >> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000100 >> RBP: ffff88021d16fa88 R08: 0000000000000001 R09: 0000000000000000 >> R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001 >> R13: ffff880220ce4200 R14: 0000000000000100 R15: 0000000000000000 >> FS: 00007f2441e7f740(0000) GS:ffff880236e00000(0000) knlGS:0000000000000000 >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >> CR2: 0000000000000100 CR3: 0000000001e0b000 CR4: 00000000001406f0 >> Stack: >> ffff88021d16fa98 ffffffff810d6543 0000000000000006 0000000000000246 >> ffff88021d16fa08 ffffffff810d532d ffff880220ce4a78 ffff880200000000 >> ffff880200000001 0000000000000000 0000000000000001 000000000093a4a0 >> Call Trace: >> [<ffffffff810d6543>] ? __lock_acquire+0xb63/0x2070 >> [<ffffffff810d532d>] ? mark_held_locks+0x6d/0xa0 >> [<ffffffff810d37a8>] ? __lock_is_held+0x58/0x80 >> [<ffffffff810d852c>] lock_acquire+0x6c/0xa0 >> [<ffffffffa039f1f6>] ? vb2_vmalloc_put_userptr+0x36/0x110 [videobuf2_vmalloc] >> [<ffffffff819b1a92>] down_read+0x42/0x60 >> [<ffffffffa039f1f6>] ? vb2_vmalloc_put_userptr+0x36/0x110 [videobuf2_vmalloc] >> [<ffffffff819af1b1>] ? mutex_lock_nested+0x2b1/0x560 >> [<ffffffffa038fdc5>] ? vb2_queue_release+0x25/0x40 [videobuf2_core] >> [<ffffffffa039f1f6>] vb2_vmalloc_put_userptr+0x36/0x110 [videobuf2_vmalloc] >> [<ffffffffa038b626>] __vb2_queue_free+0x146/0x5e0 [videobuf2_core] >> [<ffffffffa038fdd3>] vb2_queue_release+0x33/0x40 [videobuf2_core] >> [<ffffffffa038fe75>] _vb2_fop_release+0x95/0xb0 [videobuf2_core] >> [<ffffffffa038feb9>] vb2_fop_release+0x29/0x50 [videobuf2_core] >> [<ffffffffa03ad372>] vivid_fop_release+0x92/0x230 [vivid] >> [<ffffffffa0358460>] v4l2_release+0x30/0x80 [videodev] >> [<ffffffff811a51d5>] __fput+0xe5/0x200 >> [<ffffffff811a5339>] ____fput+0x9/0x10 >> [<ffffffff810a9fa4>] task_work_run+0xc4/0xf0 >> [<ffffffff8108c670>] do_exit+0x3a0/0xaf0 >> [<ffffffff819b3a9b>] ? _raw_spin_unlock_irq+0x2b/0x60 >> [<ffffffff8108e0ff>] do_group_exit+0x4f/0xe0 >> [<ffffffff8109a170>] get_signal+0x200/0x8c0 >> [<ffffffff819b14b5>] ? __mutex_unlock_slowpath+0xf5/0x240 >> [<ffffffff81002593>] do_signal+0x23/0x820 >> [<ffffffff819b1609>] ? mutex_unlock+0x9/0x10 >> [<ffffffffa0358648>] ? v4l2_ioctl+0x78/0xf0 [videodev] >> [<ffffffff819b4653>] ? int_very_careful+0x5/0x46 >> [<ffffffff810d54bd>] ? trace_hardirqs_on_caller+0x15d/0x200 >> [<ffffffff81002de0>] do_notify_resume+0x50/0x60 >> [<ffffffff819b46a6>] int_signal+0x12/0x17 >> Code: ca 81 31 c0 e8 7a e2 8c 00 e8 aa 1d 8d 00 0f 1f 44 00 00 31 db 48 81 c4 a8 00 00 00 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 90 <49> 81 3e 40 4e 02 82 b8 00 00 00 00 44 0f 44 e0 41 83 ff 01 0f >> RIP [<ffffffff810d5cd0>] __lock_acquire+0x2f0/0x2070 >> RSP <ffff88021d16f9b8> >> CR2: 0000000000000100 >> ---[ end trace 25595c2b8560cb57 ]--- >> Fixing recursive fault but reboot is needed! > > Ah, that's tricky. We can end up calling task_work_run() via > exit_task_work() after mm has been shut down. And the task work will be > dropping the last reference to all file descriptors which ends up shutting > down vb2 after current->mm has been cleaned up. > > So in the light of this it's probably better for the initial patch to > completely avoid grabbing mmap_sem in put_userptr(). It breaks locking for > vma->vm_ops->close() but that's already broken in vb2 as I explained in my > other email. And the remainder of the patch set will make sure we don't > need mmap_sem in put_userptr() at all and thus fixes the whole issue. > > This also explains why I never saw the problem in my testing - I was always > testing the patch set as a whole. > > I'll send an updated first patch later today. OK, good. I'm thinking: if it is OK with Andrew, then the low-level mm changes can be merged for 4.2 through his tree since that doesn't affect anything else (right?), but the vb2 changes I prefer to postpone to 4.3. I'd like to give it enough time for testing and shake-out any remaining issues (hopefully there aren't any, of course). The omap_vout patch can go in for 4.3 as well (this driver might be removed in the near future, so there is no hurry there), and it's up to you what to do with the exynos drm driver. Does this make sense? Regards, Hans -- To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html