We get the MEM_ONLINE notifier call if memory is added right from the kernel via add_memory() or later from user space. Let's get rid of the "ha_waiting" flag - the wait event has an inbuilt mechanism (->done) for that. Initialize the wait event only once and reinitialize before adding memory. Unconditionally call complete() and wait_for_completion_timeout(). If there are no waiters, complete() will only increment ->done - which will be reset by reinit_completion(). If complete() has already been called, wait_for_completion_timeout() will not wait. There is still the chance for a small race between concurrent reinit_completion() and complete(). If complete() wins, we would not wait - which is tolerable (and the race exists in current code as well). Note: We only wait for "some" memory to get onlined, which seems to be good enough for now. Reviewed-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> Cc: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx> Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx> Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx> Cc: Wei Liu <wei.liu@xxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxxxx> Cc: Oscar Salvador <osalvador@xxxxxxx> Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx> Cc: Baoquan He <bhe@xxxxxxxxxx> Cc: Wei Yang <richard.weiyang@xxxxxxxxx> Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> Cc: linux-hyperv@xxxxxxxxxxxxxxx Signed-off-by: David Hildenbrand <david@xxxxxxxxxx> --- drivers/hv/hv_balloon.c | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index a02ce43d778d..32e3bc0aa665 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -533,7 +533,6 @@ struct hv_dynmem_device { * State to synchronize hot-add. */ struct completion ol_waitevent; - bool ha_waiting; /* * This thread handles hot-add * requests from the host as well as notifying @@ -634,10 +633,7 @@ static int hv_memory_notifier(struct notifier_block *nb, unsigned long val, switch (val) { case MEM_ONLINE: case MEM_CANCEL_ONLINE: - if (dm_device.ha_waiting) { - dm_device.ha_waiting = false; - complete(&dm_device.ol_waitevent); - } + complete(&dm_device.ol_waitevent); break; case MEM_OFFLINE: @@ -726,8 +722,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, has->covered_end_pfn += processed_pfn; spin_unlock_irqrestore(&dm_device.ha_lock, flags); - init_completion(&dm_device.ol_waitevent); - dm_device.ha_waiting = !memhp_auto_online; + reinit_completion(&dm_device.ol_waitevent); nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); ret = add_memory(nid, PFN_PHYS((start_pfn)), @@ -753,15 +748,14 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, } /* - * Wait for the memory block to be onlined when memory onlining - * is done outside of kernel (memhp_auto_online). Since the hot - * add has succeeded, it is ok to proceed even if the pages in - * the hot added region have not been "onlined" within the - * allowed time. + * Wait for memory to get onlined. If the kernel onlined the + * memory when adding it, this will return directly. Otherwise, + * it will wait for user space to online the memory. This helps + * to avoid adding memory faster than it is getting onlined. As + * adding succeeded, it is ok to proceed even if the memory was + * not onlined in time. */ - if (dm_device.ha_waiting) - wait_for_completion_timeout(&dm_device.ol_waitevent, - 5*HZ); + wait_for_completion_timeout(&dm_device.ol_waitevent, 5 * HZ); post_status(&dm_device); } } @@ -1706,6 +1700,7 @@ static int balloon_probe(struct hv_device *dev, #ifdef CONFIG_MEMORY_HOTPLUG set_online_page_callback(&hv_online_page); + init_completion(&dm_device.ol_waitevent); register_memory_notifier(&hv_memory_nb); #endif -- 2.24.1