On Thu 25-07-19 15:05:02, David Hildenbrand wrote: > On 25.07.19 14:56, Michal Hocko wrote: > > On Wed 24-07-19 16:30:17, David Hildenbrand wrote: > >> We end up calling __add_memory() without the device hotplug lock held. > >> (I used a local patch to assert in __add_memory() that the > >> device_hotplug_lock is held - I might upstream that as well soon) > >> > >> [ 26.771684] create_memory_block_devices+0xa4/0x140 > >> [ 26.772952] add_memory_resource+0xde/0x200 > >> [ 26.773987] __add_memory+0x6e/0xa0 > >> [ 26.775161] acpi_memory_device_add+0x149/0x2b0 > >> [ 26.776263] acpi_bus_attach+0xf1/0x1f0 > >> [ 26.777247] acpi_bus_attach+0x66/0x1f0 > >> [ 26.778268] acpi_bus_attach+0x66/0x1f0 > >> [ 26.779073] acpi_bus_attach+0x66/0x1f0 > >> [ 26.780143] acpi_bus_scan+0x3e/0x90 > >> [ 26.780844] acpi_scan_init+0x109/0x257 > >> [ 26.781638] acpi_init+0x2ab/0x30d > >> [ 26.782248] do_one_initcall+0x58/0x2cf > >> [ 26.783181] kernel_init_freeable+0x1bd/0x247 > >> [ 26.784345] kernel_init+0x5/0xf1 > >> [ 26.785314] ret_from_fork+0x3a/0x50 > >> > >> So perform the locking just like in acpi_device_hotplug(). > > > > While playing with the device_hotplug_lock, can we actually document > > what it is protecting please? I have a bad feeling that we are adding > > this lock just because some other code path does rather than with a good > > idea why it is needed. This patch just confirms that. What exactly does > > the lock protect from here in an early boot stage. > > We have plenty of documentation already > > mm/memory_hotplug.c > > git grep -C5 device_hotplug mm/memory_hotplug.c > > Also see > > Documentation/core-api/memory-hotplug.rst OK, fair enough. I was more pointing to a documentation right there where the lock is declared because that is the place where people usually check for documentation. The core-api documentation looks quite nice. And based on that doc it seems that this patch is actually not needed because neither the online/offline or cpu hotplug should be possible that early unless I am missing something. > Regarding the early stage: primarily lockdep as I mentioned. Could you add a lockdep splat that would be fixed by this patch to the changelog for reference? -- Michal Hocko SUSE Labs