On Wed, May 15, 2019 at 11:12 AM Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:
>
> > Hi Pavel,
> >
> > I am working on adding this sort of a workflow into a new daxctl command
> > (daxctl-reconfigure-device)- this will allow changing the 'mode' of a
> > dax device to kmem, online the resulting memory, and with your patches,
> > also attempt to offline the memory, and change back to device-dax.
> >
> > In running with these patches, and testing the offlining part, I ran
> > into the following lockdep below.
> >
> > This is with just these three patches on top of -rc7.
> >
> >
> > [ +0.004886] ======================================================
> > [ +0.001576] WARNING: possible circular locking dependency detected
> > [ +0.001506] 5.1.0-rc7+ #13 Tainted: G O
> > [ +0.000929] ------------------------------------------------------
> > [ +0.000708] daxctl/22950 is trying to acquire lock:
> > [ +0.000548] 00000000f4d397f7 (kn->count#424){++++}, at: kernfs_remove_by_name_ns+0x40/0x80
> > [ +0.000922]
> > but task is already holding lock:
> > [ +0.000657] 000000002aa52a9f (mem_sysfs_mutex){+.+.}, at: unregister_memory_section+0x22/0xa0
>
> I have studied this issue, and now have a clear understanding why it
> happens, I am not yet sure how to fix it, so suggestions are welcomed
> :)

I would think that ACPI hotplug would have a similar problem, but it does this:

        acpi_unbind_memory_blocks(info);
        __remove_memory(nid, info->start_addr, info->length);

I wonder if that ordering prevents going too deep into the
device_unregister() call stack that you highlighted below.

>
> Here is the problem:
>
> When we offline pages we have the following call stack:
>
> # echo offline > /sys/devices/system/memory/memory8/state
> ksys_write
> vfs_write
> __vfs_write
> kernfs_fop_write
> kernfs_get_active
> lock_acquire                  kn->count#122 (lock for "memory8/state" kn)
> sysfs_kf_write
> dev_attr_store
> state_store
> device_offline
> memory_subsys_offline
> memory_block_action
> offline_pages
> __offline_pages
> percpu_down_write
> down_write
> lock_acquire                  mem_hotplug_lock.rw_sem
>
> When we unbind dax0.0 we have the following stack:
> # echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
> drv_attr_store
> unbind_store
> device_driver_detach
> device_release_driver_internal
> dev_dax_kmem_remove
> remove_memory                 device_hotplug_lock
> try_remove_memory             mem_hotplug_lock.rw_sem
> arch_remove_memory
> __remove_pages
> __remove_section
> unregister_memory_section
> remove_memory_section         mem_sysfs_mutex
> unregister_memory
> device_unregister
> device_del
> device_remove_attrs
> sysfs_remove_groups
> sysfs_remove_group
> remove_files
> kernfs_remove_by_name
> kernfs_remove_by_name_ns
> __kernfs_remove               kn->count#122
>
> So, lockdep found the ordering issue with the above two stacks:
>
> 1. kn->count#122 -> mem_hotplug_lock.rw_sem
> 2. mem_hotplug_lock.rw_sem -> kn->count#122
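
To make the inversion concrete, here is a toy userspace sketch of the kind of
check being described: record each "held -> wanted" ordering as an edge and
complain when a new edge would close a cycle. This is only an illustration
and not how lockdep is implemented. The enum names, record_dependency() and
replay_two_stacks() are hypothetical helpers made up for this example; only
the two lock names come from the stacks above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the stacks. */
enum { KN_COUNT, MEM_HOTPLUG_LOCK, NUM_LOCKS };

static const char *lock_name[NUM_LOCKS] = {
	"kn->count#122",
	"mem_hotplug_lock.rw_sem",
};

/* edge[a][b] != 0 means: b was acquired while a was held (a -> b). */
static int edge[NUM_LOCKS][NUM_LOCKS];

/* Depth-first search: can 'to' be reached from 'from' over recorded edges? */
static int reachable(int from, int to, int *seen)
{
	if (from == to)
		return 1;
	seen[from] = 1;
	for (int next = 0; next < NUM_LOCKS; next++)
		if (edge[from][next] && !seen[next] && reachable(next, to, seen))
			return 1;
	return 0;
}

/*
 * Record the ordering held -> wanted.
 * Returns 1 if a path wanted -> ... -> held already exists, i.e. the new
 * edge closes a cycle; returns 0 otherwise.
 */
static int record_dependency(int held, int wanted)
{
	int seen[NUM_LOCKS] = { 0 };
	int cycle;

	assert(held >= 0 && held < NUM_LOCKS);
	assert(wanted >= 0 && wanted < NUM_LOCKS);

	cycle = reachable(wanted, held, seen);
	edge[held][wanted] = 1;
	if (cycle)
		printf("possible circular dependency: %s -> %s\n",
		       lock_name[held], lock_name[wanted]);
	return cycle;
}

/* Replays the two orderings above; returns 1 if only the second one trips. */
static int replay_two_stacks(void)
{
	int first, second;

	memset(edge, 0, sizeof(edge));
	/* Stack 1 (offline via sysfs): kn->count held, then mem_hotplug_lock. */
	first = record_dependency(KN_COUNT, MEM_HOTPLUG_LOCK);
	/* Stack 2 (kmem unbind): mem_hotplug_lock held, then kn->count. */
	second = record_dependency(MEM_HOTPLUG_LOCK, KN_COUNT);
	return !first && second;
}
```

Calling replay_two_stacks() records ordering 1 without complaint and flags
ordering 2, which matches the two-entry summary above: neither path needs to
actually deadlock for the reversed ordering to be reported.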