Hi Peter,

> -----Original Message-----
> From: Peter Xu [mailto:peterx@xxxxxxxxxx]
> Sent: Wednesday, February 19, 2020 1:26 AM
> To: Zhoujian (jay) <jianjay.zhou@xxxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; wangxin (U)
> <wangxinxin.wang@xxxxxxxxxx>; linfeng (M) <linfeng23@xxxxxxxxxx>;
> Huangweidong (C) <weidong.huang@xxxxxxxxxx>; Liujinsong (Paul)
> <liu.jinsong@xxxxxxxxxx>
> Subject: Re: [PATCH] KVM: x86: enable dirty log gradually in small chunks
>
> On Tue, Feb 18, 2020 at 01:39:36PM +0000, Zhoujian (jay) wrote:
> > Hi Paolo,
> >
> > > -----Original Message-----
> > > From: Paolo Bonzini [mailto:pbonzini@xxxxxxxxxx]
> > > Sent: Tuesday, February 18, 2020 7:40 PM
> > > To: Zhoujian (jay) <jianjay.zhou@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx
> > > Cc: peterx@xxxxxxxxxx; wangxin (U) <wangxinxin.wang@xxxxxxxxxx>;
> > > linfeng (M) <linfeng23@xxxxxxxxxx>; Huangweidong (C)
> > > <weidong.huang@xxxxxxxxxx>
> > > Subject: Re: [PATCH] KVM: x86: enable dirty log gradually in small
> > > chunks
> > >
> > > On 18/02/20 12:00, Jay Zhou wrote:
> > > > It could take kvm->mmu_lock for an extended period of time when
> > > > enabling dirty log for the first time. The main cost is to clear
> > > > all the D-bits of last level SPTEs. This situation can benefit from
> > > > manual dirty log protect as well, which can reduce the mmu_lock
> > > > time taken. The sequence is like this:
> > > >
> > > > 1. Set all the bits of the first dirty bitmap to 1 when enabling
> > > >    dirty log for the first time
> > > > 2. Only write protect the huge pages
> > > > 3. KVM_GET_DIRTY_LOG returns the dirty bitmap info
> > > > 4. KVM_CLEAR_DIRTY_LOG will clear the D-bit for each of the
> > > >    leaf-level SPTEs gradually in small chunks
> > > >
> > > > Under the Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz environment, I
> > > > did some tests with a 128G Windows VM and counted the time taken
> > > > by memory_global_dirty_log_start; here are the numbers:
> > > >
> > > >     VM Size    Before    After optimization
> > > >     128G       460ms     10ms
> > >
> > > This is a good idea, but could userspace expect the bitmap to be 0
> > > for pages that haven't been touched?
> >
> > The userspace gets the bitmap information only from the kernel side.
> > It depends on the kernel side to distinguish whether the pages have
> > been touched, which uses the rmap to traverse for now. I don't have
> > other ideas yet, :-(
> >
> > But even if the userspace gets 1 for pages that haven't been touched,
> > these pages will be filtered out in the kernel-side
> > KVM_CLEAR_DIRTY_LOG ioctl path too, since the rmap does not exist for
> > them, I think.
> >
> > > I think this should be added as a new bit to the KVM_ENABLE_CAP for
> > > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2. That is:
> > >
> > > - in kvm_vm_ioctl_check_extension_generic, return 3 for
> > >   KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 (better: define two constants
> > >   KVM_DIRTY_LOG_MANUAL_PROTECT as 1 and KVM_DIRTY_LOG_INITIALLY_SET
> > >   as 2).
> > >
> > > - in kvm_vm_ioctl_enable_cap_generic, allow bit 0 and bit 1 for
> > >   cap->args[0]
> > >
> > > - in kvm_vm_ioctl_enable_cap_generic, check "if
> > >   (!(kvm->manual_dirty_log_protect & KVM_DIRTY_LOG_INITIALLY_SET))".
> >
> > Thanks for the details! I'll add them in the next version.
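
For the interface part, just to confirm my understanding, below is a rough
sketch of what I have in mind for V2 (untested, against
include/uapi/linux/kvm.h and virt/kvm/kvm_main.c; the final names and the
exact place of the bitmap initialization may differ, and "slot" below is
only a placeholder for the memslot being installed):

    /* include/uapi/linux/kvm.h: the two constants Paolo suggested */
    #define KVM_DIRTY_LOG_MANUAL_PROTECT    (1 << 0)
    #define KVM_DIRTY_LOG_INITIALLY_SET     (1 << 1)

    /* kvm_vm_ioctl_check_extension_generic(): advertise both bits */
    case KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2:
            return KVM_DIRTY_LOG_MANUAL_PROTECT |
                   KVM_DIRTY_LOG_INITIALLY_SET;         /* i.e. 3 */

    /* kvm_vm_ioctl_enable_cap_generic(): allow bit 0 and bit 1 of
     * cap->args[0] instead of bit 0 only */
    case KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2:
            if (cap->flags ||
                (cap->args[0] & ~(KVM_DIRTY_LOG_MANUAL_PROTECT |
                                  KVM_DIRTY_LOG_INITIALLY_SET)))
                    return -EINVAL;
            kvm->manual_dirty_log_protect = cap->args[0];
            return 0;

    /* where the dirty bitmap of a slot is allocated: start with all
     * bits set, so the first KVM_GET_DIRTY_LOG reports everything and
     * KVM_CLEAR_DIRTY_LOG write-protects gradually in small chunks */
    if (kvm->manual_dirty_log_protect & KVM_DIRTY_LOG_INITIALLY_SET)
            bitmap_set(slot->dirty_bitmap, 0, slot->npages);

Userspace that wants the new behavior would then enable the capability
with both bits set in cap->args[0] (i.e. the value 3).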
>
> I agree with Paolo that we'd better introduce a new bit for the change,
> because we don't know whether userspace assumes a zeroed dirty bitmap as
> the initial state (which is still part of the kernel ABI IIUC; actually
> that could be a good thing for some userspace).
>
> Another question is that I see you only modified the PML path. Could
> this also benefit the rest (say, SPTE write protects)?

Oh, I missed the other path; thanks for the reminder, I'll add it in V2.

Regards,
Jay Zhou

>
> Thanks,
>
> --
> Peter Xu