On 2/29/24 10:25, Catalin Marinas wrote:
On Wed, Feb 28, 2024 at 02:14:44PM -0500, Waiman Long wrote:

When some error conditions happen (like OOM), some kmemleak functions call printk() to dump out some useful debugging information while holding the kmemleak_lock. This may cause deadlock as the printk() function may need to allocate additional memory leading to a create_object() call acquiring kmemleak_lock again. Fix this deadlock issue by making sure that printk() is only called after releasing the kmemleak_lock.

I can't say I'm familiar with the printk() code but I always thought it uses some ring buffers as it can be called from all kind of contexts and allocation is not guaranteed. If printk() ends up taking kmemleak_lock through the slab allocator, I wonder whether we have bigger problems.

The lock order is always kmemleak_lock -> object->lock but if printk() triggers a callback into kmemleak, we can also get object->lock -> kmemleak_lock ordering, so another potential deadlock.
object->lock is per object whereas kmemleak_lock is global. When taking object->lock and doing a data dump that leads to a call taking kmemleak_lock, it is highly unlikely that it will need to take that particular object->lock again. I do agree that lockdep may still warn about it if that happens, as all the object->lock instances are likely to be treated as being in the same class.
I should probably clarify in the change log that the lockdep splat is actually,
[ 3991.452558] Chain exists of:
[ 3991.452559]   console_owner -> &port->lock --> kmemleak_lock
So if kmemleak calls printk() while holding kmemleak_lock, and printk() acquires either console_owner or port->lock, it may cause a deadlock.
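
For illustration only, the "record under the lock, print after unlocking" pattern described in the change log can be sketched in userspace C as below. This is not the kmemleak code: table_lock stands in for kmemleak_lock, and tracked_object, object_report, report_unknown() and unregister_object() are hypothetical names made up for the sketch. The pthread_mutex_trylock() call in the print helper only simulates a print path that needs the same lock again.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the global kmemleak_lock. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how often the print path found table_lock already held. */
static int reentry_failures;

/* Hypothetical tracked object; not the kernel's struct kmemleak_object. */
struct tracked_object {
	unsigned long pointer;
	size_t size;
	int in_table;
};

/* Snapshot of what to report, filled in while the lock is held. */
struct object_report {
	int valid;
	unsigned long pointer;
	size_t size;
};

/*
 * Stand-in for the printk() call. The trylock simulates a print path
 * that ends up needing table_lock again; if the caller still held the
 * lock, a blocking acquisition here would deadlock.
 */
static void report_unknown(const struct object_report *r)
{
	if (!r->valid)
		return;

	if (pthread_mutex_trylock(&table_lock) != 0)
		reentry_failures++;
	else
		pthread_mutex_unlock(&table_lock);

	fprintf(stderr, "unknown object at 0x%lx (size %zu)\n",
		r->pointer, r->size);
}

/* Returns 0 on success, -1 if the object was not registered. */
int unregister_object(struct tracked_object *obj)
{
	struct object_report r = { 0 };

	pthread_mutex_lock(&table_lock);
	if (!obj->in_table) {
		/* Only record the problem here; do not print yet. */
		r.valid = 1;
		r.pointer = obj->pointer;
		r.size = obj->size;
	} else {
		obj->in_table = 0;
	}
	pthread_mutex_unlock(&table_lock);

	/* Printing happens only after the lock has been dropped. */
	report_unknown(&r);
	return r.valid ? -1 : 0;
}
```

Calling unregister_object() twice on the same object makes the second call take the error path; because the message is emitted after the unlock, the simulated re-acquisition in report_unknown() succeeds and reentry_failures stays at 0.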
Cheers,
Longman