On Thu, 18 Jan 2024 at 12:00, Marco Elver <elver@xxxxxxxxxx> wrote:
>
> Alexander Potapenko writes in [1]: "For every memory access in the code
> instrumented by KMSAN we call kmsan_get_metadata() to obtain the
> metadata for the memory being accessed. For virtual memory the metadata
> pointers are stored in the corresponding `struct page`, therefore we
> need to call virt_to_page() to get them.
>
> According to the comment in arch/x86/include/asm/page.h,
> virt_to_page(kaddr) returns a valid pointer iff virt_addr_valid(kaddr)
> is true, so KMSAN needs to call virt_addr_valid() as well.
>
> To avoid recursion, kmsan_get_metadata() must not call instrumented
> code, therefore ./arch/x86/include/asm/kmsan.h forks parts of
> arch/x86/mm/physaddr.c to check whether a virtual address is valid or
> not.
>
> But the introduction of rcu_read_lock() to pfn_valid() added
> instrumented RCU API calls to virt_to_page_or_null(), which is called by
> kmsan_get_metadata(), so there is an infinite recursion now. I do not
> think it is correct to stop that recursion by doing
> kmsan_enter_runtime()/kmsan_exit_runtime() in kmsan_get_metadata(): that
> would prevent instrumented functions called from within the runtime from
> tracking the shadow values, which might introduce false positives."
>
> Fix the issue by switching pfn_valid() to the _sched() variant of
> rcu_read_lock/unlock(), which does not require calling into RCU. Given
> the critical section in pfn_valid() is very small, this is a reasonable
> trade-off (with preemptible RCU).
>
> KMSAN further needs to be careful to suppress calls into the scheduler,
> which would be another source of recursion. This can be done by wrapping
> the call to pfn_valid() into preempt_disable/enable_no_resched(). The
> downside is that this sacrifices breaking scheduling guarantees;
> however, a kernel compiled with KMSAN has already given up any
> performance guarantees due to being heavily instrumented.
>
> Note, KMSAN code already disables tracing via Makefile, and since
> mmzone.h is included, it is not necessary to use the notrace variant,
> which is generally preferred in all other cases.
>
> Link: https://lkml.kernel.org/r/20240115184430.2710652-1-glider@xxxxxxxxxx [1]
> Reported-by: Alexander Potapenko <glider@xxxxxxxxxx>
> Reported-by: syzbot+93a9e8a3dea8d6085e12@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> Cc: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>

This might want a:

Fixes: 5ec8e8ea8b77 ("mm/sparsemem: fix race in accessing memory_section->usage")

for reference to the patch that introduced the problem.

> ---
>  arch/x86/include/asm/kmsan.h | 17 ++++++++++++++++-
>  include/linux/mmzone.h       |  6 +++---
>  2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/kmsan.h b/arch/x86/include/asm/kmsan.h
> index 8fa6ac0e2d76..d91b37f5b4bb 100644
> --- a/arch/x86/include/asm/kmsan.h
> +++ b/arch/x86/include/asm/kmsan.h
> @@ -64,6 +64,7 @@ static inline bool kmsan_virt_addr_valid(void *addr)
>  {
>  	unsigned long x = (unsigned long)addr;
>  	unsigned long y = x - __START_KERNEL_map;
> +	bool ret;
>
>  	/* use the carry flag to determine if x was < __START_KERNEL_map */
>  	if (unlikely(x > y)) {
> @@ -79,7 +80,21 @@ static inline bool kmsan_virt_addr_valid(void *addr)
>  		return false;
>  	}
>
> -	return pfn_valid(x >> PAGE_SHIFT);
> +	/*
> +	 * pfn_valid() relies on RCU, and may call into the scheduler on exiting
> +	 * the critical section. However, this would result in recursion with
> +	 * KMSAN. Therefore, disable preemption here, and re-enable preemption
> +	 * below while suppressing reschedules to avoid recursion.
> +	 *
> +	 * Note, this sacrifices occasionally breaking scheduling guarantees.
> +	 * Although, a kernel compiled with KMSAN has already given up on any
> +	 * performance guarantees due to being heavily instrumented.
> +	 */
> +	preempt_disable();
> +	ret = pfn_valid(x >> PAGE_SHIFT);
> +	preempt_enable_no_resched();
> +
> +	return ret;
>  }
>
>  #endif /* !MODULE */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 4ed33b127821..a497f189d988 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2013,9 +2013,9 @@ static inline int pfn_valid(unsigned long pfn)
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __pfn_to_section(pfn);
> -	rcu_read_lock();
> +	rcu_read_lock_sched();
>  	if (!valid_section(ms)) {
> -		rcu_read_unlock();
> +		rcu_read_unlock_sched();
>  		return 0;
>  	}
>  	/*
> @@ -2023,7 +2023,7 @@ static inline int pfn_valid(unsigned long pfn)
>  	 * the entire section-sized span.
>  	 */
>  	ret = early_section(ms) || pfn_section_valid(ms, pfn);
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched();
>
>  	return ret;
>  }
> --
> 2.43.0.381.gb435a96ce8-goog
>