On Fri, Sep 16, 2022 at 02:28:08PM +0200, Uladzislau Rezki wrote:
> On Fri, Sep 16, 2022 at 02:46:39AM -0700, Kees Cook wrote:
> > On Fri, Sep 16, 2022 at 09:38:33AM +0100, Matthew Wilcox wrote:
> > > On Thu, Sep 15, 2022 at 05:59:56PM -0600, Yu Zhao wrote:
> > > > I think this is a manifestation of the lockdep warning I reported
> > > > a couple of weeks ago:
> > > > https://lore.kernel.org/r/CAOUHufaPshtKrTWOz7T7QFYUNVGFm0JBjvM700Nhf9qEL9b3EQ@xxxxxxxxxxxxxx/
> > >
> > > That would certainly match the symptoms.
> > >
> > > Turning vmap_lock into an NMI-safe lock would be bad. I don't even know
> > > if we have primitives for that (it's not like you can disable an NMI
> > > ...)
> > >
> > > I don't quite have time to write a patch right now. Perhaps something
> > > like:
> > >
> > > struct vmap_area *find_vmap_area_nmi(unsigned long addr)
> > > {
> > >         struct vmap_area *va;
> > >
> > >         if (spin_trylock(&vmap_area_lock))
> > >                 return NULL;
> > >         va = __find_vmap_area(addr, &vmap_area_root);
> > >         spin_unlock(&vmap_area_lock);
> > >
> > >         return va;
> > > }
> > >
> > > and then call find_vmap_area_nmi() in check_heap_object(). I may have
> > > the polarity of the return value of spin_trylock() incorrect.
> >
> > I think we'll need something slightly tweaked, since this would
> > return NULL under any contention (and a NULL return is fatal in
> > check_heap_object()). It seems like we need to explicitly check
> > for being in NMI context in check_heap_object() to deal with it?
> > Like this (only build tested):
> >
> > diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> > index 096d48aa3437..c8a00f181a11 100644
> > --- a/include/linux/vmalloc.h
> > +++ b/include/linux/vmalloc.h
> > @@ -216,6 +216,7 @@ void free_vm_area(struct vm_struct *area);
> >  extern struct vm_struct *remove_vm_area(const void *addr);
> >  extern struct vm_struct *find_vm_area(const void *addr);
> >  struct vmap_area *find_vmap_area(unsigned long addr);
> > +struct vmap_area *find_vmap_area_try(unsigned long addr);
> >
> >  static inline bool is_vm_area_hugepages(const void *addr)
> >  {
> > diff --git a/mm/usercopy.c b/mm/usercopy.c
> > index c1ee15a98633..9f943c29e7ec 100644
> > --- a/mm/usercopy.c
> > +++ b/mm/usercopy.c
> > @@ -173,7 +173,16 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
> >  	}
> >
> >  	if (is_vmalloc_addr(ptr)) {
> > -		struct vmap_area *area = find_vmap_area(addr);
> > +		struct vmap_area *area;
> > +
> > +		if (!in_nmi()) {
> > +			area = find_vmap_area(addr);
> > +		} else {
> > +			area = find_vmap_area_try(addr);
> > +			/* Give up under NMI to avoid deadlocks. */
> > +			if (!area)
> > +				return;
> > +		}
> >
> >  		if (!area)
> >  			usercopy_abort("vmalloc", "no area", to_user, 0, n);
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index dd6cdb201195..f14f1902c2f6 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1840,6 +1840,17 @@ struct vmap_area *find_vmap_area(unsigned long addr)
> >  	return va;
> >  }
> >
> > +struct vmap_area *find_vmap_area_try(unsigned long addr)
> > +{
> > +	struct vmap_area *va = NULL;
> > +
> > +	if (spin_trylock(&vmap_area_lock)) {
> > +		va = __find_vmap_area(addr, &vmap_area_root);
> > +		spin_unlock(&vmap_area_lock);
> > +	}
> > +	return va;
> > +}
> > +
> >  /*** Per cpu kva allocator ***/
> >
> >  /*
> >
> OK. The problem is about using find_vmap_area() from IRQ context. Indeed,
> it can deadlock. It is not supposed to be used there. But if you want to,
> then we should have a helper.
>
> Please note that it might also be a regular IRQ, so it is not limited to
> the NMI context only; somebody could later decide to use it from a
> regular IRQ.
>
> IMHO, it makes sense to use the in_interrupt() helper instead, so we
> cover the hw-IRQ context including the NMI one. It would also be aligned
> with deferred vfreeing:
>
> <snip>
> static void __vfree(const void *addr)
> {
>         if (unlikely(in_interrupt()))
>                 __vfree_deferred(addr);
>         else
>                 __vunmap(addr, 1);
> }
> <snip>
>
> so we handle here not only the NMI scenario. I think we should align.
>
Another thing I should mention: using sleepable locks (which is what
spinlocks become on PREEMPT_RT) is not allowed in any atomic context. So
from the PREEMPT_RT point of view it is broken.

--
Uladzislau Rezki
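
For illustration, here is a minimal sketch of what the suggested
in_interrupt()-based check could look like in check_heap_object(), reusing
the find_vmap_area_try() helper from the patch above. It is only a sketch
of the idea, not a tested change:

<snip>
	if (is_vmalloc_addr(ptr)) {
		struct vmap_area *area;

		if (!in_interrupt()) {
			area = find_vmap_area(addr);
		} else {
			/*
			 * Hard-IRQ/softirq/NMI context: only try the lock,
			 * and silently give up on contention to avoid
			 * deadlocking on vmap_area_lock.
			 */
			area = find_vmap_area_try(addr);
			if (!area)
				return;
		}

		if (!area)
			usercopy_abort("vmalloc", "no area", to_user, 0, n);
<snip>

Compared with the in_nmi() version, the trylock path here also covers
regular hard-IRQ and softirq context, which matches what __vfree() does
when it defers the actual free.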