On 18.10.20 05:57, Wei Yang wrote: > On Fri, Oct 16, 2020 at 11:18:39AM +0200, David Hildenbrand wrote: >> On 16.10.20 06:03, Wei Yang wrote: >>> On Mon, Oct 12, 2020 at 02:53:03PM +0200, David Hildenbrand wrote: >>>> Let's trigger from offlining code when we're not allowed to touch online > > Here "touch" means "unplug"? If so, maybe s/touch/unplug/ would be more easy > to understand. Yes, much better. [...] > I am trying to get more understanding about the logic of virtio_mem_retry(). > > Current logic seems clear to me. There are four places to trigger it: > > * notify_offline > * notify_online > * timer_expired > * config_changed > > In this patch, we try to optimize the first case, notify_offline. Yes. > > Now, we would always trigger retry when one of our memory block get offlined. > Per my understanding, this logic is correct while missed one case (or be more > precise, not handle one case timely). The case this patch wants to improve is > virtio_mem_mb_remove(). If my understanding is correct. > Yes, that's one part of it. Read below. > virtio_mem_run_wq() > virtio_mem_unplug_request() > virtio_mem_mb_unplug_any_sb_offline() > virtio_mem_mb_remove() --- 1 > virtio_mem_mb_unplug_any_sb_online() > virtio_mem_mb_offline_and_remove() --- 2 > > The above is two functions this patch adjusts. For 2), it will offline the > memory block, thus will trigger virtio_mem_retry() originally. But for 1), the > memory block is already offlined, so virtio_mem_retry() will not be triggered > originally. This is the case we want to improve in this patch. Instead of wait > for timer expire, we trigger retry immediately after unplug/remove an offlined > memory block. > > And after this change, this patch still adjust the original > virtio_mem_notify_offline() path to just trigger virtio_mem_retry() when > unplug_online is false. (This means the offline event is notified from user > space instead of from unplug event). > > If my above analysis is correct, I got one small suggestion for this patch. > Instead of adjust current notify_offline handling, how about just trigger > retry during virtio_mem_mb_remove()? Since per my understanding, we just want > to do immediate trigger retry when unplug an offlined memory block. I probably should have added the following to the patch description: "This is a preparation for Big Block Mode (BBM), whereby we can see some temporary offlining of memory blocks without actually making progress" Imagine you have a Big Block that spans to Linux memory blocks. Assume the first Linux memory blocks has no unmovable data on it. Assume you call offline_and_remove_memory() 1. Try to offline the first block. Works, notifiers triggered. virtio_mem_retry(). 2. Try to offline the second block. Does not work. 3. Re-online first block. 4. Exit to main loop, exit workqueue. 5. Retry immediately (due to virtio_mem_retry()), go to 1. So, you'll keep retrying forever. Found while debugging that exact issue :) -- Thanks, David / dhildenb