On Tue, Mar 07, 2023 at 05:11:07PM +0100, David Hildenbrand wrote:
> On 06.03.23 22:39, Peter Xu wrote:
>
> Note that I wondered for a second if we'd call it "UFFD_FEATURE_WP_MISSING"
> instead (similar to the definition of MISSING uffd that triggers when we
> have nothing mapped).

I'll be fine with either WP_UNPOPULATED or WP_ZEROPAGE, but hope to avoid
MISSING as that's definitely a keyword in uffd, which can be confusing.

>
> Just a thought.
>
> [...]
>
> > With WP_UNPOPULATED, application like QEMU can avoid pre-read faults all the
> > memory before wr-protect during taking a live snapshot.  Quoting from
> > Muhammad's test result here [3] based on a simple program [4]:
> >
> >   (1) With huge page disabled
> >   echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
> >   ./uffd_wp_perf
> >   Test DEFAULT: 4
> >   Test PRE-READ: 1111453 (pre-fault 1101011)
> >   Test MADVISE: 278276 (pre-fault 266378)
> >   Test WP-UNPOPULATE: 11712
> >
> >   (2) With Huge page enabled
> >   echo always > /sys/kernel/mm/transparent_hugepage/enabled
> >   ./uffd_wp_perf
> >   Test DEFAULT: 4
> >   Test PRE-READ: 22521 (pre-fault 22348)
> >   Test MADVISE: 4909 (pre-fault 4743)
> >   Test WP-UNPOPULATE: 14448
> >
> > There'll be a great perf boost for no-thp case, while for thp enabled with
> > extreme case of all-thp-zero WP_UNPOPULATED can be slower than MADVISE, but
> > that's low possibility in reality, also the overhead was not reduced but
> > postponed until a follow up write on any huge zero thp, so potentitially it
>
> s/potentitially/potentially/
>
> > is faster by making the follow up writes slower.
>
> What I realized, interestingly not only the writes, but also the reads. In

Curious why reading a zeropage would be a problem?

> case of background snapshots we'll be reading all VM memory I think ... but
> we could optimize in QEMU by consulting the pagemap if there is anything
> mapped at all, and not read zeros in that case [an optimization brought up
> several times already].
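
For illustration only, the pagemap check described in the quoted paragraph
above could look roughly like the sketch below. It is written in Python
rather than QEMU's C and is not taken from QEMU; the helper names
(entry_is_populated, read_pagemap_entries, populated_pages) are
hypothetical. It relies only on the documented /proc/<pid>/pagemap layout:
one native-endian 64-bit entry per virtual page, with bit 63 meaning
"present" and bit 62 meaning "swapped".

```python
import os
import struct

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
PM_ENTRY_BYTES = 8        # one 64-bit pagemap entry per virtual page
PM_PRESENT = 1 << 63      # page is present in RAM
PM_SWAPPED = 1 << 62      # page is swapped out


def entry_is_populated(entry):
    """True if the pagemap entry says something is mapped (in RAM or swap)."""
    return bool(entry & (PM_PRESENT | PM_SWAPPED))


def read_pagemap_entries(vaddr, npages, pid="self"):
    """Read npages raw pagemap entries starting at virtual address vaddr."""
    offset = (vaddr // PAGE_SIZE) * PM_ENTRY_BYTES
    with open("/proc/%s/pagemap" % pid, "rb") as f:
        f.seek(offset)
        data = f.read(npages * PM_ENTRY_BYTES)
    count = len(data) // PM_ENTRY_BYTES
    return list(struct.unpack("=%dQ" % count, data[:count * PM_ENTRY_BYTES]))


def populated_pages(vaddr, npages, pid="self"):
    """Hypothetical helper: addresses of pages worth reading at all.

    Pages absent from this list have nothing mapped, so a snapshot
    reader could treat them as zero without touching them.
    """
    entries = read_pagemap_entries(vaddr, npages, pid)
    return [vaddr + i * PAGE_SIZE
            for i, e in enumerate(entries) if entry_is_populated(e)]
```

A snapshot thread would call populated_pages() over each guest RAM range
and skip everything else. Note that a page backed by the shared zeropage
still shows up as present, so this only filters out truly unpopulated
ranges.
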
>
> I am not sure yet if we want to change the QEMU implementation. But anyhow,
> that's a different discussion.

The rest of the comments all look good here, I'll fix and resend, thanks.

-- 
Peter Xu