On Wed, Oct 07, 2020 at 04:41PM +0200, Marco Elver wrote:
> On Wed, 7 Oct 2020 at 16:15, Jann Horn <jannh@xxxxxxxxxx> wrote:
[...]
> > > > > +       return false;
> > > > > +
> > > > > +       if (protect)
> > > > > +               set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> > > > > +       else
> > > > > +               set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
> > > >
> > > > Hmm... do we have this helper (instead of using the existing helpers
> > > > for modifying memory permissions) to work around the allocation out of
> > > > the data section?
> > >
> > > I just played around with using the set_memory.c functions, to remind
> > > myself why this didn't work. I experimented with using
> > > set_memory_{np,p}() functions; set_memory_p() isn't implemented, but
> > > is easily added (which I did for below experiment). However, this
> > > didn't quite work:
> > [...]
> > > For one, smp_call_function_many_cond() doesn't want to be called with
> > > interrupts disabled, and we may very well get a KFENCE allocation or
> > > page fault with interrupts disabled / within interrupts.
> > >
> > > Therefore, to be safe, we should avoid IPIs.
> >
> > set_direct_map_invalid_noflush() does that, too, I think? And that's
> > already implemented for both arm64 and x86.
>
> Sure, that works.
>
> We still want the flush_tlb_one_kernel(), at least so the local CPU's
> TLB is flushed.

Nope, sorry, set_direct_map_invalid_noflush() does not work -- this
results in potential deadlock.

	================================
	WARNING: inconsistent lock state
	5.9.0-rc4+ #2 Not tainted
	--------------------------------
	inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
	ksoftirqd/1/16 [HC0[0]:SC1[1]:HE1:SE0] takes:
	ffffffff89fcf9b8 (cpa_lock){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
	ffffffff89fcf9b8 (cpa_lock){+.?.}-{2:2}, at: __change_page_attr_set_clr+0x1b0/0x2510 arch/x86/mm/pat/set_memory.c:1658
	{SOFTIRQ-ON-W} state was registered at:
	  lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006
	  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
	  _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
	  spin_lock include/linux/spinlock.h:354 [inline]
	  __change_page_attr_set_clr+0x1b0/0x2510 arch/x86/mm/pat/set_memory.c:1658
	  change_page_attr_set_clr+0x333/0x500 arch/x86/mm/pat/set_memory.c:1752
	  change_page_attr_set arch/x86/mm/pat/set_memory.c:1782 [inline]
	  set_memory_nx+0xb2/0x110 arch/x86/mm/pat/set_memory.c:1930
	  free_init_pages+0x73/0xc0 arch/x86/mm/init.c:876
	  alternative_instructions+0x155/0x1a4 arch/x86/kernel/alternative.c:738
	  check_bugs+0x1bd0/0x1c77 arch/x86/kernel/cpu/bugs.c:140
	  start_kernel+0x486/0x4b6 init/main.c:1042
	  secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
	irq event stamp: 14564
	hardirqs last  enabled at (14564): [<ffffffff8828cadf>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
	hardirqs last  enabled at (14564): [<ffffffff8828cadf>] _raw_spin_unlock_irqrestore+0x6f/0x90 kernel/locking/spinlock.c:191
	hardirqs last disabled at (14563): [<ffffffff8828d239>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
	hardirqs last disabled at (14563): [<ffffffff8828d239>] _raw_spin_lock_irqsave+0xa9/0xce kernel/locking/spinlock.c:159
	softirqs last  enabled at (14486): [<ffffffff8147fcff>] run_ksoftirqd kernel/softirq.c:652 [inline]
	softirqs last  enabled at (14486): [<ffffffff8147fcff>] run_ksoftirqd+0xcf/0x170 kernel/softirq.c:644
	softirqs last disabled at (14491): [<ffffffff8147fcff>] run_ksoftirqd kernel/softirq.c:652 [inline]
	softirqs last disabled at (14491): [<ffffffff8147fcff>] run_ksoftirqd+0xcf/0x170 kernel/softirq.c:644

	other info that might help us debug this:
	 Possible unsafe locking scenario:

	       CPU0
	       ----
	  lock(cpa_lock);
	  <Interrupt>
	    lock(cpa_lock);

	 *** DEADLOCK ***

	1 lock held by ksoftirqd/1/16:
	 #0: ffffffff8a067e20 (rcu_callback){....}-{0:0}, at: rcu_do_batch kernel/rcu/tree.c:2418 [inline]
	 #0: ffffffff8a067e20 (rcu_callback){....}-{0:0}, at: rcu_core+0x55d/0x1130 kernel/rcu/tree.c:2656

	stack backtrace:
	CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.9.0-rc4+ #2
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
	Call Trace:
	 __dump_stack lib/dump_stack.c:77 [inline]
	 dump_stack+0x198/0x1fd lib/dump_stack.c:118
	 print_usage_bug kernel/locking/lockdep.c:3350 [inline]
	 valid_state kernel/locking/lockdep.c:3361 [inline]
	 mark_lock_irq kernel/locking/lockdep.c:3575 [inline]
	 mark_lock.cold+0x12/0x17 kernel/locking/lockdep.c:4006
	 mark_usage kernel/locking/lockdep.c:3905 [inline]
	 __lock_acquire+0x1159/0x5780 kernel/locking/lockdep.c:4380
	 lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006
	 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
	 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
	 spin_lock include/linux/spinlock.h:354 [inline]
	 __change_page_attr_set_clr+0x1b0/0x2510 arch/x86/mm/pat/set_memory.c:1658
	 __set_pages_np arch/x86/mm/pat/set_memory.c:2184 [inline]
	 set_direct_map_invalid_noflush+0xd2/0x110 arch/x86/mm/pat/set_memory.c:2189
	 kfence_protect_page arch/x86/include/asm/kfence.h:62 [inline]
	 kfence_protect+0x10e/0x120 mm/kfence/core.c:124
	 kfence_guarded_free+0x380/0x880 mm/kfence/core.c:375
	 rcu_do_batch kernel/rcu/tree.c:2428 [inline]
	 rcu_core+0x5ca/0x1130 kernel/rcu/tree.c:2656
	 __do_softirq+0x1f8/0xb23 kernel/softirq.c:298
	 run_ksoftirqd kernel/softirq.c:652 [inline]
	 run_ksoftirqd+0xcf/0x170 kernel/softirq.c:644
	 smpboot_thread_fn+0x655/0x9e0 kernel/smpboot.c:165
	 kthread+0x3b5/0x4a0 kernel/kthread.c:292
	 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
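
For comparison, the open-coded helper quoted at the top of this mail only flips the present bit of a single PTE and does not go through cpa_lock. Below is a rough userspace sketch of just that bit manipulation, not the kernel code itself. The sketch_* names and the type are made up for illustration, and it assumes the present bit is bit 0 (as it is on x86); the real helper additionally needs the page table lookup and the local TLB flush mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the present bit is bit 0, as on x86. Name is local to this sketch. */
#define SKETCH_PAGE_PRESENT ((uint64_t)1 << 0)

/* Hypothetical stand-in for a page table entry; not the kernel's pte_t. */
typedef struct {
	uint64_t val;
} sketch_pte_t;

/* Report whether the present bit is currently set. */
static bool sketch_is_present(const sketch_pte_t *pte)
{
	assert(pte != NULL);
	return (pte->val & SKETCH_PAGE_PRESENT) != 0;
}

/*
 * Clear the present bit when protecting, set it when unprotecting.
 * All other bits of the entry are left untouched, mirroring the
 * "& ~_PAGE_PRESENT" / "| _PAGE_PRESENT" pair in the quoted patch.
 */
static void sketch_protect(sketch_pte_t *pte, bool protect)
{
	assert(pte != NULL);
	if (protect)
		pte->val &= ~SKETCH_PAGE_PRESENT;
	else
		pte->val |= SKETCH_PAGE_PRESENT;
}
```

Calling sketch_protect(&pte, true) followed by sketch_protect(&pte, false) returns the entry to its original value, since only one bit is ever modified and no lock is involved.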