On Fri, Jan 28, 2022 at 04:33:44PM +0100, Jason A. Donenfeld wrote: > From: Andy Lutomirski <luto@xxxxxxxxxx> > > We don't need spinlocks to protect batched entropy -- all we need > is a little bit of care. This should fix up the following splat that > Jonathan received with a PROVE_LOCKING=y/PROVE_RAW_LOCK_NESTING=y > kernel: > > [ 2.500000] [ BUG: Invalid wait context ] > [ 2.500000] 5.17.0-rc1 #563 Not tainted > [ 2.500000] ----------------------------- > [ 2.500000] swapper/1 is trying to lock: > [ 2.500000] c0b0e9cc (batched_entropy_u32.lock){....}-{3:3}, at: invalidate_batched_entropy+0x18/0x4c > [ 2.500000] other info that might help us debug this: > [ 2.500000] context-{2:2} > [ 2.500000] 3 locks held by swapper/1: > [ 2.500000] #0: c0ae86ac (event_mutex){+.+.}-{4:4}, at: event_trace_init+0x4c/0xd8 > [ 2.500000] #1: c0ae81b8 (trace_event_sem){+.+.}-{4:4}, at: event_trace_init+0x68/0xd8 > [ 2.500000] #2: c19b05cc (&sb->s_type->i_mutex_key#2){+.+.}-{4:4}, at: start_creating+0x40/0xc4 > [ 2.500000] stack backtrace: > [ 2.500000] CPU: 0 PID: 1 Comm: swapper Not tainted 5.17.0-rc1 #563 > [ 2.500000] Hardware name: WPCM450 chip > [ 2.500000] [<c00100a8>] (unwind_backtrace) from [<c000db2c>] (show_stack+0x10/0x14) > [ 2.500000] [<c000db2c>] (show_stack) from [<c0054eec>] (__lock_acquire+0x3f0/0x189c) > [ 2.500000] [<c0054eec>] (__lock_acquire) from [<c0054478>] (lock_acquire+0x2b8/0x354) > [ 2.500000] [<c0054478>] (lock_acquire) from [<c0568028>] (_raw_spin_lock_irqsave+0x60/0x74) > [ 2.500000] [<c0568028>] (_raw_spin_lock_irqsave) from [<c030b6f4>] (invalidate_batched_entropy+0x18/0x4c) > [ 2.500000] [<c030b6f4>] (invalidate_batched_entropy) from [<c030e7fc>] (crng_fast_load+0xf0/0x110) > [ 2.500000] [<c030e7fc>] (crng_fast_load) from [<c030e954>] (add_interrupt_randomness+0x138/0x200) > [ 2.500000] [<c030e954>] (add_interrupt_randomness) from [<c0061b34>] (handle_irq_event_percpu+0x18/0x38) > [ 2.500000] [<c0061b34>] (handle_irq_event_percpu) from [<c0061b8c>] (handle_irq_event+0x38/0x5c) > [ 2.500000] [<c0061b8c>] (handle_irq_event) from [<c0065b28>] (handle_fasteoi_irq+0x9c/0x114) > [ 2.500000] [<c0065b28>] (handle_fasteoi_irq) from [<c0061178>] (handle_irq_desc+0x24/0x34) > [ 2.500000] [<c0061178>] (handle_irq_desc) from [<c056214c>] (generic_handle_arch_irq+0x28/0x3c) > [ 2.500000] [<c056214c>] (generic_handle_arch_irq) from [<c0008eb4>] (__irq_svc+0x54/0x80) > [ 2.500000] Exception stack(0xc1485d48 to 0xc1485d90) > [ 2.500000] 5d40: 9780e804 00000001 c09413d4 200000d3 60000053 c016af54 > [ 2.500000] 5d60: 00000000 c0afa5b8 c14194e0 c19a1d48 c0789ce0 00000000 c1490480 c1485d98 > [ 2.500000] 5d80: c0168970 c0168984 20000053 ffffffff > [ 2.500000] [<c0008eb4>] (__irq_svc) from [<c0168984>] (read_seqbegin.constprop.0+0x6c/0x90) > [ 2.500000] [<c0168984>] (read_seqbegin.constprop.0) from [<c016af54>] (d_lookup+0x14/0x40) > [ 2.500000] [<c016af54>] (d_lookup) from [<c015cecc>] (lookup_dcache+0x18/0x50) > [ 2.500000] [<c015cecc>] (lookup_dcache) from [<c015d868>] (lookup_one_len+0x90/0xe0) > [ 2.500000] [<c015d868>] (lookup_one_len) from [<c01e33e4>] (start_creating+0x68/0xc4) > [ 2.500000] [<c01e33e4>] (start_creating) from [<c01e398c>] (tracefs_create_file+0x30/0x11c) > [ 2.500000] [<c01e398c>] (tracefs_create_file) from [<c00c42f8>] (trace_create_file+0x14/0x38) > [ 2.500000] [<c00c42f8>] (trace_create_file) from [<c00cc854>] (event_create_dir+0x310/0x420) > [ 2.500000] [<c00cc854>] (event_create_dir) from [<c00cc9d8>] (__trace_early_add_event_dirs+0x28/0x50) > [ 2.500000] [<c00cc9d8>] (__trace_early_add_event_dirs) from [<c07c8d64>] (event_trace_init+0x70/0xd8) > [ 2.500000] [<c07c8d64>] (event_trace_init) from [<c07c8560>] (tracer_init_tracefs+0x14/0x284) > [ 2.500000] [<c07c8560>] (tracer_init_tracefs) from [<c000a330>] (do_one_initcall+0xdc/0x288) > [ 2.500000] [<c000a330>] (do_one_initcall) from [<c07bd1e8>] (kernel_init_freeable+0x1c4/0x20c) > [ 2.500000] [<c07bd1e8>] (kernel_init_freeable) from [<c05629c0>] (kernel_init+0x10/0x110) > [ 2.500000] [<c05629c0>] (kernel_init) from [<c00084f8>] (ret_from_fork+0x14/0x3c) > [ 2.500000] Exception stack(0xc1485fb0 to 0xc1485ff8) > [ 2.500000] 5fa0: 00000000 00000000 00000000 00000000 > [ 2.500000] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > [ 2.500000] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx> > [Jason: I extracted this from a larger in-progress series of Andy's that > also unifies the two batches into one and does other performance > things. Since that's still under development, but because we need this > part to fix the CONFIG_PROVE_RAW_LOCK_NESTING issue, I've extracted it > out and applied it to the current setup. This will also make it easier > to backport to old kernels that also need the fix. I've also amended > Andy's original commit message.] > Reported-by: Jonathan Neuschäfer <j.neuschaefer@xxxxxxx> > Link: https://lore.kernel.org/lkml/YfMa0QgsjCVdRAvJ@latitude/ > Fixes: b7d5dc21072c ("random: add a spinlock_t to struct batched_entropy") > Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx> > --- > Andy - could you take a look at this and let me know if it's still > correct after I've ported it out of your series and into a standalone > thing here? I'd prefer to hold off on moving forward on this until I > receive our green light. I'm also still a bit uncertain about your NB: > comment regarding the acceptable race. If you could elaborate on that > argument, it might save me a few cycles with my thinking cap on. > > drivers/char/random.c | 57 ++++++++++++++++++++++++------------------- > 1 file changed, 32 insertions(+), 25 deletions(-) FWIW, this does fix the splat on my machine. Tested-by: Jonathan Neuschäfer <j.neuschaefer@xxxxxxx> Thanks everyone, Jonathan
Attachment:
signature.asc
Description: PGP signature