On Wed, 5 Feb 2014, Nathan Zimmer wrote:

> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 62a0cd1..a3cbd14 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -985,12 +985,12 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>  		if (need_zonelists_rebuild)
>  			zone_pcp_reset(zone);
>  		mutex_unlock(&zonelists_mutex);
> +		unlock_memory_hotplug();
>  		printk(KERN_DEBUG "online_pages [mem %#010llx-%#010llx] failed\n",
>  		       (unsigned long long) pfn << PAGE_SHIFT,
>  		       (((unsigned long long) pfn + nr_pages)
>  			<< PAGE_SHIFT) - 1);
>  		memory_notify(MEM_CANCEL_ONLINE, &arg);
> -		unlock_memory_hotplug();
>  		return ret;
>  	}
>
> @@ -1016,9 +1016,10 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>
>  	writeback_set_ratelimit();
>
> +	unlock_memory_hotplug();
> +
>  	if (onlined_pages)
>  		memory_notify(MEM_ONLINE, &arg);
> -	unlock_memory_hotplug();
>
>  	return 0;
>  }

That looks a little problematic: what happens if a nid is being brought online, a registered callback allocates resources for arg->status_change_nid, and the two hunks above end up racing with a concurrent online operation?

Before this patch, a registered callback was guaranteed to see either MEM_CANCEL_ONLINE or MEM_ONLINE after it had already handled MEM_GOING_ONLINE. With your patch, we could race and see one cpu doing MEM_GOING_ONLINE, another cpu doing MEM_GOING_ONLINE, and then MEM_ONLINE and MEM_CANCEL_ONLINE arriving in either order.

So I think this patch will break most registered callbacks that actually depend on lock_memory_hotplug(); it's a coarse lock for that reason.
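
For illustration only, here is a minimal sketch of the kind of callback I mean (not taken from any in-tree user; example_mem_callback, example_mem_nb, pending_nid_state and the 64-byte allocation are made up). It stages per-node state at MEM_GOING_ONLINE and assumes the matching MEM_ONLINE or MEM_CANCEL_ONLINE for the same operation arrives before another online attempt can start, which lock_memory_hotplug() currently guarantees:

/*
 * Hypothetical memory hotplug notifier sketch.  It relies on the
 * GOING_ONLINE/ONLINE/CANCEL_ONLINE events of one online operation not
 * being interleaved with those of another.
 */
#include <linux/memory.h>
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/slab.h>

static void *pending_nid_state;	/* staged for the in-flight online op */

static int example_mem_callback(struct notifier_block *nb,
				unsigned long action, void *arg)
{
	struct memory_notify *mn = arg;

	switch (action) {
	case MEM_GOING_ONLINE:
		/* Stage resources for the node gaining memory, if any. */
		if (mn->status_change_nid >= 0) {
			pending_nid_state = kzalloc(64, GFP_KERNEL);
			if (!pending_nid_state)
				return notifier_from_errno(-ENOMEM);
		}
		break;
	case MEM_ONLINE:
		/* Hand the staged state over to its owner (elided). */
		pending_nid_state = NULL;
		break;
	case MEM_CANCEL_ONLINE:
		/* Roll back whatever MEM_GOING_ONLINE staged. */
		kfree(pending_nid_state);
		pending_nid_state = NULL;
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block example_mem_nb = {
	.notifier_call = example_mem_callback,
};

static int __init example_init(void)
{
	return register_memory_notifier(&example_mem_nb);
}
module_init(example_init);
MODULE_LICENSE("GPL");

With the unlock hoisted above memory_notify(), a second online_pages() could take the lock and deliver its own MEM_GOING_ONLINE before this callback has seen the MEM_ONLINE or MEM_CANCEL_ONLINE for the first operation, clobbering pending_nid_state.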