On 19.10.20 09:54, Wei Yang wrote: > On Mon, Oct 12, 2020 at 02:53:23PM +0200, David Hildenbrand wrote: >> Let's add a safe mechanism to unplug memory, avoiding long/endless loops >> when trying to offline memory - similar to in SBM. >> >> Fake-offline all memory (via alloc_contig_range()) before trying to >> offline+remove it. Use this mode as default, but allow to enable the other >> mode explicitly (which could give better memory hotunplug guarantees in > > I don't get the point how unsafe mode would have a better guarantees? It's primarily only relevant when there is a lot of concurrent action going on while unplugging memory. Using alloc_contig_range() on ZONE_MOVABLE can fail more easily than memory offlining. alloc_contig_range() doesn't try as hard as memory offlining code to isolate memory. There are known issues with temporary page pinning (e.g., when a process dies) and the PCP. (mostly discovered via CMA allocations) See the TODO I add in patch #14. [...] >> >> + if (bbm_safe_unplug) { >> + /* >> + * Start by fake-offlining all memory. Once we marked the device >> + * block as fake-offline, all newly onlined memory will >> + * automatically be kept fake-offline. Protect from concurrent >> + * onlining/offlining until we have a consistent state. >> + */ >> + mutex_lock(&vm->hotplug_mutex); >> + virtio_mem_bbm_set_bb_state(vm, bb_id, >> + VIRTIO_MEM_BBM_BB_FAKE_OFFLINE); >> + > > State is set here. > >> + for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) { >> + page = pfn_to_online_page(pfn); >> + if (!page) >> + continue; >> + >> + rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION); >> + if (rc) { >> + end_pfn = pfn; >> + goto rollback_safe_unplug; >> + } >> + } >> + mutex_unlock(&vm->hotplug_mutex); >> + } >> + >> rc = virtio_mem_bbm_offline_and_remove_bb(vm, bb_id); >> - if (rc) >> + if (rc) { >> + if (bbm_safe_unplug) { >> + mutex_lock(&vm->hotplug_mutex); >> + goto rollback_safe_unplug; >> + } >> return rc; >> + } >> >> rc = virtio_mem_bbm_unplug_bb(vm, bb_id); >> if (rc) > > And changed to PLUGGED or UNUSED based on rc. Right, after offlining+remove succeeded. So no longer added to Linux. The final state depends on the success of the unplug request towards the hypervisor. > >> @@ -1987,6 +2069,17 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm, >> virtio_mem_bbm_set_bb_state(vm, bb_id, >> VIRTIO_MEM_BBM_BB_UNUSED); >> return rc; >> + >> +rollback_safe_unplug: >> + for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) { >> + page = pfn_to_online_page(pfn); >> + if (!page) >> + continue; >> + virtio_mem_fake_online(pfn, PAGES_PER_SECTION); >> + } >> + virtio_mem_bbm_set_bb_state(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED); > > And changed to ADDED if failed. Right, back to the initial state when entering this function. > >> + mutex_unlock(&vm->hotplug_mutex); >> + return rc; >> } > > So in which case, the bbm state is FAKE_OFFLINE during > virtio_mem_bbm_notify_going_offline() and > virtio_mem_bbm_notify_cancel_offline() ? Exactly, so we can do our magic with fake-offline pages and our virtio_mem_bbm_offline_and_remove_bb() can actually succeed. -- Thanks, David / dhildenb