On Tue, Jul 26 2022 at  5:44P -0400,
Milan Broz <gmazyland@xxxxxxxxx> wrote:

> On 26/07/2022 18:09, Mike Snitzer wrote:
> > Hi,
> >
> > Please see this updated patchset that reflects what I've staged for
> > the 5.20 merge window, see:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.20
> >
> > My only outstanding question, from the previous v1 patchset, is: should
> > the verify_wq be created using WQ_HIGHPRI instead of WQ_CPU_INTENSIVE?
> > (I doubt it has a significant impact on performance, but if you have
> > insight on why you made that change, and if it is meaningful, I'd
> > happily apply the change.)
> >
> > I've tested using cryptsetup's testsuite (which has dm-verity tests)
> > but I haven't tested the "try_verify_in_tasklet" feature.

I wasn't lying above: I haven't tested the "try_verify_in_tasklet"
feature.  I just figured I didn't break what Huck posted by cleaning it
up while reviewing closely ;)

> Hi Mike,
>
> I added a new veritysetup option --use-tasklets for testing to a new branch
> https://gitlab.com/cryptsetup/cryptsetup/-/commits/verify-tasklet
>
> I tried to run verity-compat-test (on that branch above); I have not used
> the flag yet, except in one simple option flag test (see the branch).

OK, thanks for doing this work, really appreciate it.

How would I initiate this test using your "verify-tasklet" branch?
(I only ever run "make check"; I'm not a power user of cryptsetup's
testing, so any pointers on how best to focus on dm-verity testing using
the existing make targets and options would be welcome.)

But I looked at your new test and, if I understand it correctly, it just
loads a test device with "try_verify_in_tasklet" set in the optional
args.  (Safe to assume any IO issued to the device is a side effect of
udev scripts?)

But yeah, you haven't even tried to break it yet.. and it fell over ;)

> But with your patches (on top of 5.19.0-rc8) and my 32-bit testing VM:
>
> - FEC tests are skipped even if FEC is enabled in the kernel.
> I think the whole FEC is broken with your patches.
> (Beware, the test will skip FEC quietly! Usually CONFIG_DM_VERITY_FEC is
> disabled, so we have to ignore it.)

With tasklets the FEC code path should always punt to the workqueue,
which is free to sleep.  But the lockdep trace you provided below seems
to be concerned about dm_bufio_get in the tasklet (having taken the
bufio lock with _irqsave) vs dm_bufio_prefetch in the workqueue taking
global_spinlock with plain spin_lock (not _irqsave).

I'm just missing the link between the two, as in: how would the
dm_bufio_get directly get to dm_bufio_prefetch without first bouncing
through the workqueue?  (The answer should always be: it can't.)  If the
lockdep report shows how, I'm missing it (my ability to read lockdep
traces is suboptimal).

Oh.. I think it's just saying that, since all of this locking is done
with spinlocks regardless of context, there is potential for issues due
to inconsistent locking.  It isn't saying it actually triggered the
deadlock.  Basically it seems like a false positive (the workqueue won't
ever run in interrupt context).

I wonder if adjust_total_allocated() should just grow some lockdep
annotation to silence lockdep rather than go to the extreme of using
_irqsave (but I'll go with _irqsave first).
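Something like the following is what I'm thinking for dm-bufio
(completely untested sketch, just to illustrate the _irqsave idea; the
function body and field names here are from memory, not an actual
patch):

/*
 * Rough sketch, not the real dm-bufio code: the only point is that
 * global_spinlock gets taken with interrupts disabled, so it becomes
 * consistent with the tasklet path that already takes c->spinlock
 * with _irqsave.
 */
static void adjust_total_allocated(struct dm_buffer *b, bool unlink)
{
	unsigned long flags;
	long diff = (long)b->c->block_size;

	if (unlink)
		diff = -diff;

	spin_lock_irqsave(&global_spinlock, flags);	/* was: spin_lock() */

	dm_bufio_current_allocated += diff;
	/* ... rest of the accounting, unchanged ... */

	spin_unlock_irqrestore(&global_spinlock, flags);	/* was: spin_unlock() */
}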
Thanks,
Mike

> - Also I see this warning (in that simple option test I added).
>
> : =====================================================
> : WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> : 5.19.0-rc8+ #767 Not tainted
> : -----------------------------------------------------
> : kworker/u16:6/2488 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> : f7a38090 (global_spinlock){+.+.}-{2:2}, at: adjust_total_allocated+0x95/0x120 [dm_bufio]
> :
> : and this task is already holding:
> : c555086c (&c->spinlock){..-.}-{2:2}, at: dm_bufio_lock+0x54/0x60 [dm_bufio]
> : which would create a new lock dependency:
> :  (&c->spinlock){..-.}-{2:2} -> (global_spinlock){+.+.}-{2:2}
> :
> : but this new dependency connects a SOFTIRQ-irq-safe lock:
> :  (&c->spinlock){..-.}-{2:2}
> :
> : ... which became SOFTIRQ-irq-safe at:
> :   lock_acquire+0xb2/0x2b0
> :   _raw_spin_lock_irqsave_nested+0x3b/0x90
> :   dm_bufio_lock+0x54/0x60 [dm_bufio]
> :   new_read+0x32/0x120 [dm_bufio]
> :   dm_bufio_get+0xd/0x10 [dm_bufio]
> :   verity_verify_level+0x199/0x220 [dm_verity]
> :   verity_hash_for_block+0x26/0xf0 [dm_verity]
> :   verity_verify_io+0x134/0x490 [dm_verity]
> :   verity_tasklet+0xf/0x7f [dm_verity]
> :   tasklet_action_common.constprop.0+0xd0/0xf0
> :   tasklet_action+0x21/0x30
> :   __do_softirq+0xb4/0x4c5
> :   run_ksoftirqd+0x35/0x50
> :   smpboot_thread_fn+0x174/0x230
> :   kthread+0xd2/0x100
> :   ret_from_fork+0x19/0x24
> :
> : to a SOFTIRQ-irq-unsafe lock:
> :  (global_spinlock){+.+.}-{2:2}
> :
> : ... which became SOFTIRQ-irq-unsafe at:
> : ...
> :   lock_acquire+0xb2/0x2b0
> :   _raw_spin_lock+0x28/0x70
> :   adjust_total_allocated+0x95/0x120 [dm_bufio]
> :   __link_buffer+0xb2/0xf0 [dm_bufio]
> :   __bufio_new+0x20b/0x2b0 [dm_bufio]
> :   dm_bufio_prefetch+0x90/0x1f0 [dm_bufio]
> :   verity_prefetch_io+0x142/0x180 [dm_verity]
> :   process_one_work+0x246/0x530
> :   worker_thread+0x47/0x3e0
> :   kthread+0xd2/0x100
> :   ret_from_fork+0x19/0x24
> :
> : other info that might help us debug this:
> :
> :  Possible interrupt unsafe locking scenario:
> :
> :        CPU0                    CPU1
> :        ----                    ----
> :   lock(global_spinlock);
> :                                local_irq_disable();
> :                                lock(&c->spinlock);
> :                                lock(global_spinlock);
> :   <Interrupt>
> :     lock(&c->spinlock);
> :
> :  *** DEADLOCK ***
> :
> : 3 locks held by kworker/u16:6/2488:
> :  #0: c55494b8 ((wq_completion)kverityd){+.+.}-{0:0}, at: process_one_work+0x1d0/0x530
> :  #1: c6c49f38 ((work_completion)(&pw->work)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x530
> :  #2: c555086c (&c->spinlock){..-.}-{2:2}, at: dm_bufio_lock+0x54/0x60 [dm_bufio]
> :
> : the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
> : -> (&c->spinlock){..-.}-{2:2} ops: 2 {
> :    IN-SOFTIRQ-W at:
> :                     lock_acquire+0xb2/0x2b0
> :                     _raw_spin_lock_irqsave_nested+0x3b/0x90
> :                     dm_bufio_lock+0x54/0x60 [dm_bufio]
> :                     new_read+0x32/0x120 [dm_bufio]
> :                     dm_bufio_get+0xd/0x10 [dm_bufio]
> :                     verity_verify_level+0x199/0x220 [dm_verity]
> :                     verity_hash_for_block+0x26/0xf0 [dm_verity]
> :                     verity_verify_io+0x134/0x490 [dm_verity]
> :                     verity_tasklet+0xf/0x7f [dm_verity]
> :                     tasklet_action_common.constprop.0+0xd0/0xf0
> :                     tasklet_action+0x21/0x30
> :                     __do_softirq+0xb4/0x4c5
> :                     run_ksoftirqd+0x35/0x50
> :                     smpboot_thread_fn+0x174/0x230
> :                     kthread+0xd2/0x100
> :                     ret_from_fork+0x19/0x24
> :    INITIAL USE at:
> :                    lock_acquire+0xb2/0x2b0
> :                    _raw_spin_lock_irqsave_nested+0x3b/0x90
> :                    dm_bufio_lock+0x54/0x60 [dm_bufio]
> :                    dm_bufio_prefetch+0x4b/0x1f0 [dm_bufio]
> :                    verity_prefetch_io+0x3c/0x180 [dm_verity]
> :                    process_one_work+0x246/0x530
> :                    worker_thread+0x47/0x3e0
> :                    kthread+0xd2/0x100
> :                    ret_from_fork+0x19/0x24
> :  }
> :  ... key at: [<f7a384e8>] __key.24+0x0/0xffffeb18 [dm_bufio]
> :
> : the dependencies between the lock to be acquired
> :  and SOFTIRQ-irq-unsafe lock:
> : -> (global_spinlock){+.+.}-{2:2} ops: 129329 {
> :    HARDIRQ-ON-W at:
> :                     lock_acquire+0xb2/0x2b0
> :                     _raw_spin_lock+0x28/0x70
> :                     adjust_total_allocated+0x95/0x120 [dm_bufio]
> :                     __link_buffer+0xb2/0xf0 [dm_bufio]
> :                     __bufio_new+0x20b/0x2b0 [dm_bufio]
> :                     dm_bufio_prefetch+0x90/0x1f0 [dm_bufio]
> :                     verity_prefetch_io+0x142/0x180 [dm_verity]
> :                     process_one_work+0x246/0x530
> :                     worker_thread+0x47/0x3e0
> :                     kthread+0xd2/0x100
> :                     ret_from_fork+0x19/0x24
> :    SOFTIRQ-ON-W at:
> :                     lock_acquire+0xb2/0x2b0
> :                     _raw_spin_lock+0x28/0x70
> :                     adjust_total_allocated+0x95/0x120 [dm_bufio]
> :                     __link_buffer+0xb2/0xf0 [dm_bufio]
> :                     __bufio_new+0x20b/0x2b0 [dm_bufio]
> :                     dm_bufio_prefetch+0x90/0x1f0 [dm_bufio]
> :                     verity_prefetch_io+0x142/0x180 [dm_verity]
> :                     process_one_work+0x246/0x530
> :                     worker_thread+0x47/0x3e0
> :                     kthread+0xd2/0x100
> :                     ret_from_fork+0x19/0x24
> :    INITIAL USE at:
> :                    lock_acquire+0xb2/0x2b0
> :                    _raw_spin_lock+0x28/0x70
> :                    adjust_total_allocated+0x95/0x120 [dm_bufio]
> :                    __link_buffer+0xb2/0xf0 [dm_bufio]
> :                    __bufio_new+0x20b/0x2b0 [dm_bufio]
> :                    dm_bufio_prefetch+0x90/0x1f0 [dm_bufio]
> :                    verity_prefetch_io+0x142/0x180 [dm_verity]
> :                    process_one_work+0x246/0x530
> :                    worker_thread+0x47/0x3e0
> :                    kthread+0xd2/0x100
> :                    ret_from_fork+0x19/0x24
> :  }
> :  ... key at: [<f7a38090>] global_spinlock+0x10/0xffffef80 [dm_bufio]
> :  ... acquired at:
> :    lock_acquire+0xb2/0x2b0
> :    _raw_spin_lock+0x28/0x70
> :    adjust_total_allocated+0x95/0x120 [dm_bufio]
> :    __link_buffer+0xb2/0xf0 [dm_bufio]
> :    __bufio_new+0x20b/0x2b0 [dm_bufio]
> :    dm_bufio_prefetch+0x90/0x1f0 [dm_bufio]
> :    verity_prefetch_io+0x3c/0x180 [dm_verity]
> :    process_one_work+0x246/0x530
> :    worker_thread+0x47/0x3e0
> :    kthread+0xd2/0x100
> :    ret_from_fork+0x19/0x24
> :
> : stack backtrace:
> : CPU: 1 PID: 2488 Comm: kworker/u16:6 Not tainted 5.19.0-rc8+ #767
> : Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> : Workqueue: kverityd verity_prefetch_io [dm_verity]
> : Call Trace:
> :  dump_stack_lvl+0x68/0x98
> :  dump_stack+0xd/0x10
> :  print_bad_irq_dependency.cold+0x1f2/0x1f8
> :  __lock_acquire+0x2522/0x2840
> :  ? __this_cpu_preempt_check+0xf/0x11
> :  lock_acquire+0xb2/0x2b0
> :  ? adjust_total_allocated+0x95/0x120 [dm_bufio]
> :  ? __this_cpu_preempt_check+0xf/0x11
> :  _raw_spin_lock+0x28/0x70
> :  ? adjust_total_allocated+0x95/0x120 [dm_bufio]
> :  adjust_total_allocated+0x95/0x120 [dm_bufio]
> :  __link_buffer+0xb2/0xf0 [dm_bufio]
> :  ? alloc_buffer+0xc3/0x100 [dm_bufio]
> :  __bufio_new+0x20b/0x2b0 [dm_bufio]
> :  dm_bufio_prefetch+0x90/0x1f0 [dm_bufio]
> :  verity_prefetch_io+0x3c/0x180 [dm_verity]
> :  process_one_work+0x246/0x530
> :  ? 0xc1000000
> :  worker_thread+0x47/0x3e0
> :  kthread+0xd2/0x100
> :  ? process_one_work+0x530/0x530
> :  ? kthread_complete_and_exit+0x20/0x20
> :  ret_from_fork+0x19/0x24
>
> m.

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel