Re: [PATCH] mm/kmemleak: Don't hold kmemleak_lock when calling printk()

Waiman Long <longman@xxxxxxxxxx> · Thu, 29 Feb 2024 10:55:38 -0500

    On 2/29/24 10:25, Catalin Marinas wrote:

      On Wed, Feb 28, 2024 at 02:14:44PM -0500, Waiman Long wrote:

        When some error conditions happen (like OOM), some kmemleak functions
call printk() to dump out some useful debugging information while holding
the kmemleak_lock. This may cause deadlock as the printk() function
may need to allocate additional memory leading to a create_object()
call acquiring kmemleak_lock again.

Fix this deadlock issue by making sure that printk() is only called
after releasing the kmemleak_lock.

      I can't say I'm familiar with the printk() code but I always thought it
uses some ring buffers as it can be called from all kind of contexts and
allocation is not guaranteed.

If printk() ends up taking kmemleak_lock through the slab allocator, I
wonder whether we have bigger problems. The lock order is always
kmemleak_lock -> object->lock but if printk() triggers a callback into
kmemleak, we can also get object->lock -> kmemleak_lock ordering, so
another potential deadlock.

    object->lock is per object whereas kmemleak_lock is global.
      When taking object->lock and doing a data dump that leads to a
      call that takes kmemleak_lock, it is highly unlikely that it will
      need to take that particular object->lock again. I do agree
      that lockdep may still warn about it if that happens, as all the
      object->lock's are likely to be treated as being in the same
      class.
    I should probably clarify in the change log that the lockdep
      splat is actually:

    [ 3991.452558] Chain exists of:
[ 3991.452559] console_owner -> &port->lock --> kmemleak_lock

    So if kmemleak calls printk() while holding kmemleak_lock, and printk() acquires either console_owner or port->lock, it may cause a deadlock.
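
    To illustrate the general idea of "record under the lock, report after
    unlock", here is a minimal user-space sketch. It is not the kmemleak code:
    demo_lock, demo_log(), demo_lookup() and the table are hypothetical
    stand-ins, a pthread mutex plays the role of kmemleak_lock, and the
    counters are single-threaded instrumentation only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; none of these names come from mm/kmemleak.c. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_depth;      /* non-zero while demo_lock is held (single-threaded demo) */
static int logs_under_lock; /* number of log calls made with demo_lock held */

struct demo_object {
	unsigned long pointer;
	size_t size;
};

static struct demo_object table[] = {
	{ 0x1000, 64 },
	{ 0x2000, 128 },
};

/* Stand-in for printk(): notes whether it was called with demo_lock held. */
static void demo_log(const char *msg, unsigned long ptr)
{
	if (lock_depth > 0)
		logs_under_lock++;
	fprintf(stderr, "%s 0x%lx\n", msg, ptr);
}

/*
 * Look up ptr in the table under demo_lock. On failure, only remember
 * that the lookup failed while the lock is held; the message is emitted
 * after the lock has been dropped.
 */
static bool demo_lookup(unsigned long ptr, size_t *size_out)
{
	bool found = false;

	assert(size_out != NULL);

	pthread_mutex_lock(&demo_lock);
	lock_depth++;
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].pointer == ptr) {
			*size_out = table[i].size;
			found = true;
			break;
		}
	}
	lock_depth--;
	pthread_mutex_unlock(&demo_lock);

	/* The lock is released here, so the logger may take its own locks. */
	if (!found)
		demo_log("unknown object at", ptr);

	return found;
}
```

    The point of the sketch is only the ordering: whatever the logging path
    needs to acquire, it is acquired with the global lock already dropped.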
    Cheers,
Longman