Hello Mike,

On Wed, Jan 30, 2019 at 10:13:36AM +0200, Mike Rapoport wrote:
> We (CRIU) have some concerns about obsoleting soft-dirty in favor of
> uffd-wp. If there are other soft-dirty users these concerns would be
> relevant to them as well.
>
> With soft-dirty we collect the information about the changed memory every
> pre-dump iteration in the following manner:
> * freeze the tasks
> * find entries in /proc/pid/pagemap with SOFT_DIRTY set
> * unfreeze the tasks
> * dump the modified pages to disk/remote host
>
> While we do need to traverse the /proc/pid/pagemap to identify dirty pages,
> in between the pre-dump iterations and during the actual memory dump the
> tasks are running freely.
>
> If we are to switch to uffd-wp, every write by the snapshotted/migrated
> task will incur latency of uffd-wp processing by the monitor.

That's a valid concern indeed.

I didn't go into the details of which features are needed on top of
what is already present in Peter's current patchset, but you're
correct that, to perform well as a softdirty equivalent, we'll also
need to add an async event model.

The async event model would be selected at UFFD registration. It'd
work like async signals: uffd events are queued up in the kernel, each
one allocated as a slab object (not on the kernel stack of the
faulting process). Only if the monitor doesn't read() them fast enough
would the write protect fault eventually block and release the
mmap_sem, but even in that case the page fault would always be
resolved by the kernel.

For the monitor there'll be just a stream of uffd_msg structures to
read, in multiples of the uffd_msg structure size, with a single
syscall per wakeup of the monitor.

Conceptually it'd work the same as how PML works for EPT.

The main downside will be an allocation per fault (soft dirty doesn't
need such an allocation), but no round trip to userland would be added
to the latency of the wrprotect fault that needs to be logged.

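
To make the monitor side of this concrete, here is a minimal sketch of
draining such a stream with one read() per wakeup. It is only an
illustration under assumptions: the async registration mode does not
exist in the current patchset, so the sketch takes any fd that delivers
whole uffd_msg records back to back, and the name drain_wp_events() and
the batch size of 64 are made up for the example. The structure layout
and the UFFD_* constants are the ones from <linux/userfaultfd.h>.

```c
#include <assert.h>
#include <errno.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define WP_BATCH 64 /* hypothetical batch size, not from any patchset */

/*
 * Read up to min(cap, WP_BATCH) queued uffd_msg records from fd with a
 * single read() and store the address of every write-protect fault in
 * addrs[]. Returns the number of addresses stored, or -1 with errno set.
 *
 * ASSUMPTION: fd delivers complete uffd_msg records; the async mode that
 * would queue them without blocking the faulting task is hypothetical.
 */
ssize_t drain_wp_events(int fd, uint64_t *addrs, size_t cap)
{
	struct uffd_msg batch[WP_BATCH];
	size_t want = cap < WP_BATCH ? cap : WP_BATCH;
	size_t i, nmsgs, n = 0;
	ssize_t len;

	assert(addrs != NULL && cap > 0);

	/* One syscall per wakeup; retry only if a signal interrupted it. */
	do {
		len = read(fd, batch, want * sizeof(batch[0]));
	} while (len < 0 && errno == EINTR);
	if (len < 0)
		return -1;

	/* The stream is expected in multiples of the uffd_msg size. */
	if ((size_t)len % sizeof(batch[0]) != 0) {
		errno = EIO;
		return -1;
	}

	nmsgs = (size_t)len / sizeof(batch[0]);
	for (i = 0; i < nmsgs; i++) {
		/* Keep only page faults flagged as write-protect faults. */
		if (batch[i].event != UFFD_EVENT_PAGEFAULT)
			continue;
		if (!(batch[i].arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP))
			continue;
		addrs[n++] = batch[i].arg.pagefault.address;
	}
	return (ssize_t)n;
}
```

A monitor would call this after poll() reports the fd readable and
append the returned addresses to its dirty log for the next pre-dump
iteration. Nothing has to be written back to the kernel in this model,
because the fault has already been resolved by the time the record is
read.
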
We need the synchronous/blocking uffd-wp for other things that aren't
related to soft dirty and that can't be achieved with an async model
like softdirty. Adding an async model later would be a self-contained
feature inside uffd.

So the idea would be to ignore any comparison with softdirty until
uffd-wp is finalized, and then evaluate the possibility of adding an
async model, which would be a simple thing to add in comparison with
the uffd-wp feature itself.

The theoretical expectation would be that softdirty performs better
for small processes (but for those the overall logging overhead is
small anyway), but when it gets to the hundred-gigabytes/terabytes
range, async uffd-wp should perform much better.

Thanks,
Andrea