On Wed, Aug 12, 2020 at 10:06:50AM +0200, Marco Elver wrote:
> On Tue, Aug 11, 2020 at 10:17PM +0200, peterz@xxxxxxxxxxxxx wrote:
> > On Tue, Aug 11, 2020 at 11:46:51AM +0200, peterz@xxxxxxxxxxxxx wrote:
> >
> > > So let me once again see if I can't find a better solution for this all.
> > > Clearly it needs one :/
> >
> > So the below boots without triggering the debug code from Marco -- it
> > should allow nesting local_irq_save/restore under raw_local_irq_*().
> >
> > I tried unconditional counting, but there's some _really_ wonky /
> > asymmetric code that wrecks that and I've not been able to come up with
> > anything useful.
> >
> > This one starts counting when local_irq_save() finds it didn't disable
> > IRQs while lockdep thought it did. At that point, local_irq_restore()
> > will decrement and enable things again when it reaches 0.
> >
> > This assumes local_irq_save()/local_irq_restore() are nested sane, which
> > is mostly true.
> >
> > This leaves #PF, which I fixed in these other patches, but I realized it
> > needs fixing for all architectures :-( No bright ideas there yet.
> >
> > ---
> >  arch/x86/entry/thunk_32.S       |  5 ----
> >  include/linux/irqflags.h        | 45 +++++++++++++++++++-------------
> >  init/main.c                     | 16 ++++++++++++
> >  kernel/locking/lockdep.c        | 58 +++++++++++++++++++++++++++++++++++++++++
> >  kernel/trace/trace_preemptirq.c | 33 +++++++++++++++++++++++
> >  5 files changed, 134 insertions(+), 23 deletions(-)
>
> Testing this again with syzkaller produced some new reports:
>
> BUG: stack guard page was hit in error_entry
> BUG: stack guard page was hit in exc_int3
> PANIC: double fault in error_entry
> PANIC: double fault in exc_int3
>
> Most of them have corrupted reports, but this one might be useful:
>
> BUG: stack guard page was hit at 000000001fab0982 (stack is 00000000063f33dc..00000000bf04b0d8)
> BUG: stack guard page was hit at 00000000ca97ac69 (stack is 00000000af3e6c84..000000001597e1bf)
> kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP
> CPU: 1 PID: 4709 Comm: kworker/1:1H Not tainted 5.8.0+ #5
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> Workqueue: events_highpri snd_vmidi_output_work
> RIP: 0010:exc_int3+0x5/0xf0 arch/x86/kernel/traps.c:636
> Code: c9 85 4d 89 e8 31 c0 e8 a9 7d 68 fd e9 90 fe ff ff e8 0f 35 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 53 48 89 fb <e8> 76 0e 00 00 85 c0 74 03 5b 5d c3 f6 83 88 00 00 00 03 74 7e 48
> RSP: 0018:ffffc90008114000 EFLAGS: 00010083
> RAX: 0000000084e00e17 RBX: ffffc90008114018 RCX: ffffffff84e00e17
> RDX: 0000000000000000 RSI: ffffffff84e00a39 RDI: ffffc90008114018
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88807dc80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffc90008113ff8 CR3: 000000002dae4006 CR4: 0000000000770ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 00000000
> Call Trace:
> asm_exc_int3+0x31/0x40 arch/x86/include/asm/idtentry.h:537
> RIP: 0010:arch_static_branch include/trace/events/preemptirq.h:40 [inline]
> RIP: 0010:static_key_false include/linux/jump_label.h:200 [inline]
> RIP: 0010:trace_irq_enable_rcuidle+0xd/0x120 include/trace/events/preemptirq.h:40
> Code: 24 08 48 89 df e8 43 8d ef ff 48 89 df 5b e9 4a 2e 99 03 66 2e 0f 1f 84 00 00 00 00 00 55 41 56 53 48 89 fb e8 84 1a fd ff cc <1f> 44 00 00 5b 41 5e 5d c3 65 8b 05 ab 74 c3 7e 89 c0 31 f6 48 0f
> RSP: 0018:ffffc900081140f8 EFLAGS: 00000093
> RAX: ffffffff813d9e8c RBX: ffffffff81314dd3 RCX: ffff888076ce6000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81314dd3
> RBP: 0000000000000000 R08: ffffffff813da3d4 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000082 R14: 0000000000000000 R15: ffff888076ce6000
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
>
> <... repeated many many times ...>
>
> trace_irq_enable_rcuidle+0x87/0x120 include/trace/events/preemptirq.h:40
> trace_hardirqs_restore+0x59/0x80 kernel/trace/trace_preemptirq.c:106
> rcu_irq_enter_irqson+0x43/0x70 kernel/rcu/tree.c:1074
> Lost 500 message(s)!
> BUG: stack guard page was hit at 00000000cab483ba (stack is 00000000b1442365..00000000c26f9ad3)
> BUG: stack guard page was hit at 00000000318ff8d8 (stack is 00000000fd87d656..0000000058100136)
> ---[ end trace 4157e0bb4a65941a ]---

Wheee... recursion! Let me try and see if I can make something of that.