On 25.07.19 23:23, Rafael J. Wysocki wrote: > On Thu, Jul 25, 2019 at 10:49 PM David Hildenbrand <david@xxxxxxxxxx> wrote: >> >> On 25.07.19 21:19, Michal Hocko wrote: >>> On Thu 25-07-19 16:35:07, David Hildenbrand wrote: >>>> On 25.07.19 15:57, Michal Hocko wrote: >>>>> On Thu 25-07-19 15:05:02, David Hildenbrand wrote: >>>>>> On 25.07.19 14:56, Michal Hocko wrote: >>>>>>> On Wed 24-07-19 16:30:17, David Hildenbrand wrote: >>>>>>>> We end up calling __add_memory() without the device hotplug lock held. >>>>>>>> (I used a local patch to assert in __add_memory() that the >>>>>>>> device_hotplug_lock is held - I might upstream that as well soon) >>>>>>>> >>>>>>>> [ 26.771684] create_memory_block_devices+0xa4/0x140 >>>>>>>> [ 26.772952] add_memory_resource+0xde/0x200 >>>>>>>> [ 26.773987] __add_memory+0x6e/0xa0 >>>>>>>> [ 26.775161] acpi_memory_device_add+0x149/0x2b0 >>>>>>>> [ 26.776263] acpi_bus_attach+0xf1/0x1f0 >>>>>>>> [ 26.777247] acpi_bus_attach+0x66/0x1f0 >>>>>>>> [ 26.778268] acpi_bus_attach+0x66/0x1f0 >>>>>>>> [ 26.779073] acpi_bus_attach+0x66/0x1f0 >>>>>>>> [ 26.780143] acpi_bus_scan+0x3e/0x90 >>>>>>>> [ 26.780844] acpi_scan_init+0x109/0x257 >>>>>>>> [ 26.781638] acpi_init+0x2ab/0x30d >>>>>>>> [ 26.782248] do_one_initcall+0x58/0x2cf >>>>>>>> [ 26.783181] kernel_init_freeable+0x1bd/0x247 >>>>>>>> [ 26.784345] kernel_init+0x5/0xf1 >>>>>>>> [ 26.785314] ret_from_fork+0x3a/0x50 >>>>>>>> >>>>>>>> So perform the locking just like in acpi_device_hotplug(). >>>>>>> >>>>>>> While playing with the device_hotplug_lock, can we actually document >>>>>>> what it is protecting please? I have a bad feeling that we are adding >>>>>>> this lock just because some other code path does rather than with a good >>>>>>> idea why it is needed. This patch just confirms that. What exactly does >>>>>>> the lock protect from here in an early boot stage. >>>>>> >>>>>> We have plenty of documentation already >>>>>> >>>>>> mm/memory_hotplug.c >>>>>> >>>>>> git grep -C5 device_hotplug mm/memory_hotplug.c >>>>>> >>>>>> Also see >>>>>> >>>>>> Documentation/core-api/memory-hotplug.rst >>>>> >>>>> OK, fair enough. I was more pointing to a documentation right there >>>>> where the lock is declared because that is the place where people >>>>> usually check for documentation. The core-api documentation looks quite >>>>> nice. And based on that doc it seems that this patch is actually not >>>>> needed because neither the online/offline or cpu hotplug should be >>>>> possible that early unless I am missing something. >>>> >>>> I really prefer to stick to locking rules as outlined on the >>>> interfaces if it doesn't hurt. Why it is not needed is not clear. >>>> >>>>> >>>>>> Regarding the early stage: primarily lockdep as I mentioned. >>>>> >>>>> Could you add a lockdep splat that would be fixed by this patch to the >>>>> changelog for reference? >>>>> >>>> >>>> I have one where I enforce what's documented (but that's of course not >>>> upstream and therefore not "real" yet) >>> >>> Then I suppose to not add locking for something that is not a problem. >>> Really, think about it. People will look at this code and follow the >>> lead without really knowing why the locking is needed. >>> device_hotplug_lock has its purpose and if the code in question doesn't >>> need synchronization for the documented scenarios then the locking >>> simply shouldn't be there. Adding the lock just because of a >>> non-existing, and IMHO dubious, lockdep splats is just wrong. >>> >>> We need to rationalize the locking here, not to add more hacks. >> >> No, sorry. The real hack is calling a function that is *documented* to >> be called under lock without it. That is an optimization for a special >> case. That is the black magic in the code. >> >> The only alternative I see to this patch is adding a comment like >> >> /* >> * We end up calling __add_memory() without the device_hotplug_lock >> * held. This is fine as we cannot race with other hotplug activities >> * and userspace trying to online memory blocks. >> */ >> >> Personally, I don't think that's any better than just grabbing the lock >> as we are told to. (honestly, I don't see how optimizing away the lock >> here is of *any* help to optimize our overall memory hotplug locking) >> >> @Rafael, what's your take? lock or comment? > > Well, I have ACKed your patch already. :-) It's never to late to un-ACK if you changed your mind :) > > That said, adding a comment stating that the lock is acquired mostly > for consistency wouldn't hurt. > I can certainly do that. Thanks! -- Thanks, David / dhildenb