Re: [Bug 216489] New: Machine freezes due to memory lock

On Fri, Sep 16, 2022 at 03:15:05PM +0100, Matthew Wilcox wrote:
> On Fri, Sep 16, 2022 at 02:46:39AM -0700, Kees Cook wrote:
> > On Fri, Sep 16, 2022 at 09:38:33AM +0100, Matthew Wilcox wrote:
> > > On Thu, Sep 15, 2022 at 05:59:56PM -0600, Yu Zhao wrote:
> > > > I think this is a manifestation of the lockdep warning I reported a couple
> > > > of weeks ago:
> > > > https://lore.kernel.org/r/CAOUHufaPshtKrTWOz7T7QFYUNVGFm0JBjvM700Nhf9qEL9b3EQ@xxxxxxxxxxxxxx/
> > > 
> > > That would certainly match the symptoms.
> > > 
> > > Turning vmap_lock into an NMI-safe lock would be bad.  I don't even know
> > > if we have primitives for that (it's not like you can disable an NMI
> > > ...)
> > > 
> > > I don't quite have time to write a patch right now.  Perhaps something
> > > like:
> > > 
> > > struct vmap_area *find_vmap_area_nmi(unsigned long addr)
> > > {
> > >         struct vmap_area *va;
> > > 
> > > >         if (!spin_trylock(&vmap_area_lock))
> > > >                 return NULL;
> > >         va = __find_vmap_area(addr, &vmap_area_root);
> > >         spin_unlock(&vmap_area_lock);
> > > 
> > >         return va;
> > > }
> > > 
> > > and then call find_vmap_area_nmi() in check_heap_object().  I may have
> > > the polarity of the return value of spin_trylock() incorrect.
> > 
> > I think we'll need something slightly tweaked, since this would
> > return NULL under any contention (and a NULL return is fatal in
> > check_heap_object()). It seems like we need to explicitly check
> > for being in nmi context in check_heap_object() to deal with it?
> > Like this (only build tested):
> 
> Right, and Ulad is right about it being callable from any context.  I think
> the long-term solution is to make the vmap_area_root tree walkable under
> RCU protection.
> 
> For now, let's have a distinct return code (ERR_PTR(-EAGAIN), perhaps?) to
> indicate that we've hit contention.  It generally won't matter if we
> hit it in process context because hardening doesn't have to be 100%
> reliable to be useful.
> 
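A minimal userspace sketch of the "distinct return code on contention" idea
above, assuming pthreads in place of the kernel spinlock. Everything here is
hypothetical and for illustration only: struct area, area_lock,
find_area_trylock() and check_object() are made-up names, and ERR_PTR(),
IS_ERR() and PTR_ERR() are simplified stand-ins for the kernel helpers of the
same names. Note that pthread_mutex_trylock() returns 0 on success, which is
the opposite polarity to the kernel's spin_trylock().

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
        return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
        return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
        return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for struct vmap_area: a half-open range [start, end). */
struct area {
        unsigned long start;
        unsigned long end;
};

static pthread_mutex_t area_lock = PTHREAD_MUTEX_INITIALIZER;
static struct area areas[] = {
        { 0x1000, 0x2000 },
        { 0x8000, 0x9000 },
};

/*
 * Look up the area containing addr without ever blocking on the lock.
 * Returns the area, NULL if no area contains addr, or ERR_PTR(-EAGAIN)
 * if the lock was contended and the lookup could not be performed.
 */
static struct area *find_area_trylock(unsigned long addr)
{
        struct area *found = NULL;
        size_t i;

        /* pthread_mutex_trylock() returns 0 when the lock was acquired. */
        if (pthread_mutex_trylock(&area_lock) != 0)
                return ERR_PTR(-EAGAIN);

        for (i = 0; i < sizeof(areas) / sizeof(areas[0]); i++) {
                if (addr >= areas[i].start && addr < areas[i].end) {
                        found = &areas[i];
                        break;
                }
        }
        pthread_mutex_unlock(&area_lock);

        return found;
}

/*
 * Hypothetical caller: contention means "skip the check" rather than
 * "reject", since the check does not have to be 100% reliable.
 * Returns 0 if the check passed or was skipped, -1 if addr is in no area.
 */
static int check_object(unsigned long addr)
{
        struct area *a = find_area_trylock(addr);

        if (IS_ERR(a))
                return 0;
        return a ? 0 : -1;
}
```

With this shape the caller can tell "no such area" (NULL) apart from "could
not look" (-EAGAIN), so contention no longer turns into a fatal NULL. It does
nothing about the lifetime race below: the returned pointer is still used
after the lock has been dropped.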
> Erm ... so what prevents this race:
> 
> CPU 0					CPU 1
> copy_to_user()
> check_heap_object()
> area = find_vmap_area(addr)
> 					__purge_vmap_area_lazy()
> 					merge_or_add_vmap_area_augment()
> 					__merge_or_add_vmap_area()
> 					kmem_cache_free(vmap_area_cachep, va);
>
Sounds like it can happen. I think it is a good argument for switching to
RCU here, so that va can be accessed safely after the lock is released.
I can think about it and add it as a task to my todo list. Since it
is not urgent so far, it is OK to wait for a splat. But it might never
happen :)

--
Uladzislau Rezki



