On 4/27/20 8:13 PM, Qian Cai wrote:
On Apr 27, 2020, at 7:56 PM, Waiman Long <longman@xxxxxxxxxx> wrote:
A lockdep splat is observed by echoing "1" to the shrink sysfs file
and then shutting down the system:
[ 167.473392] Chain exists of:
[ 167.473392] kn->count#279 --> mem_hotplug_lock.rw_sem --> slab_mutex
[ 167.473392]
[ 167.484323] Possible unsafe locking scenario:
[ 167.484323]
[ 167.490273]        CPU0                    CPU1
[ 167.494825]        ----                    ----
[ 167.499376]   lock(slab_mutex);
[ 167.502530]                                lock(mem_hotplug_lock.rw_sem);
[ 167.509356]                                lock(slab_mutex);
[ 167.515044]   lock(kn->count#279);
[ 167.518462]
[ 167.518462] *** DEADLOCK ***
It is because of the get_online_cpus() and get_online_mems() calls in
kmem_cache_shrink() invoked via the shrink sysfs file. To fix that, we
have to use trylock to get the memory and cpu hotplug read locks. Since
hotplug events are rare, it should be fine to refuse a kmem cache
shrink operation when some hotplug events are in progress.
I don’t understand how trylock could prevent a splat. The fundamental issue is that in the sysfs slab store case, the locking order (once the trylock succeeds) is,
kn->count --> cpu/memory_hotplug
But we have the existing reverse chain everywhere.
cpu/memory_hotplug --> slab_mutex --> kn->count
The sequence that is prevented by this patch is "kn->count -->
mem_hotplug_lock.rw_sem". This sequence isn't directly in the splat. Once
this link is broken, the 3-lock circular loop cannot be formed. Maybe I
should modify the commit log to make this point clearer.
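
To illustrate the point, here is a rough user-space sketch of the dependency
graph. It is only a toy model, not lockdep code: the enum, struct and function
names are all made up for this example, and it assumes that a trylock which
fails instead of waiting records no "kn->count --> mem_hotplug_lock.rw_sem"
dependency. The other two edges are the ones from the existing chain.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock classes named in the splat; the enum itself is a hypothetical toy model. */
enum lock_class { KN_COUNT, MEM_HOTPLUG_LOCK, SLAB_MUTEX, NR_CLASSES };

/* edge[a][b] means "b was acquired, possibly blocking, while a was held". */
struct lock_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void add_dep(struct lock_graph *g, enum lock_class held,
		    enum lock_class wanted)
{
	assert(held < NR_CLASSES && wanted < NR_CLASSES);
	g->edge[held][wanted] = true;
}

/* Depth-first search: can 'to' be reached from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/* A circular dependency exists if some edge a -> b has a path from b back to a. */
static bool has_cycle(const struct lock_graph *g)
{
	for (int a = 0; a < NR_CLASSES; a++)
		for (int b = 0; b < NR_CLASSES; b++) {
			bool seen[NR_CLASSES] = { false };

			if (g->edge[a][b] && reachable(g, b, a, seen))
				return true;
		}
	return false;
}

/*
 * Model the sysfs shrink path. With a blocking hotplug read lock the
 * kn->count --> mem_hotplug_lock edge is recorded; with a trylock it is
 * assumed not to be (see the assumption stated above).
 */
static bool shrink_path_can_deadlock(bool blocking_hotplug_lock)
{
	struct lock_graph g = { 0 };

	add_dep(&g, MEM_HOTPLUG_LOCK, SLAB_MUTEX);	/* existing chain */
	add_dep(&g, SLAB_MUTEX, KN_COUNT);		/* existing chain */
	if (blocking_hotplug_lock)
		add_dep(&g, KN_COUNT, MEM_HOTPLUG_LOCK);	/* shrink via sysfs */
	return has_cycle(&g);
}
```

In this model shrink_path_can_deadlock(true) finds the 3-lock loop, while
shrink_path_can_deadlock(false) does not, because without the first edge
nothing leads from kn->count back to mem_hotplug_lock.rw_sem.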
Cheers,
Longman