On Wed, Apr 11, 2012 at 08:22:07PM +0900, Takuya Yoshikawa wrote:
> I am now testing the following patch.
>
> Note: this technique is used in several subsystems, e.g. jbd.
>
> Although people tend to say that holding mmu_lock during get_dirty is
> always a problem, my impression is slightly different.

Other than the potential performance improvement, the worst-case
scenario of holding mmu_lock for hundreds of milliseconds at the
beginning of migration of a huge guest must be fixed.

> When we call get_dirty, most hot memory pages have already been
> written at least once and faults are becoming rare.
>
> Actually, I rarely saw rescheduling due to mmu_lock contention when
> I tested this patch locally -- though my testing is not enough yet.
>
> In contrast, if we do O(1) write protection, we need to write protect
> 511 pages soon after get_dirty, and the chance of mmu_lock contention
> may increase if multiple VCPUs try to write to memory.
>
> Anyway, this patch is small and seems effective.
>
> Takuya
>
> ===
> From: Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx>
>
> get_dirty_log() needs to hold mmu_lock while write protecting dirty
> pages, and this can take long when there are many dirty pages to
> protect.
>
> As the guest can take faults during that time, this may result in a
> severe latency problem which would prevent the system from scaling.
>
> This patch mitigates the problem by checking for mmu_lock contention
> after every 2K dirty pages we protect: we selected this value because
> it took about 100us to protect 2K dirty pages.
>
> TODO: more numbers.
>
> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@xxxxxxxxxxxxx>
> ---
>  arch/x86/include/asm/kvm_host.h |    6 +++---
>  arch/x86/kvm/mmu.c              |   12 +++++++++---
>  arch/x86/kvm/x86.c              |   18 +++++++++++++-----
>  3 files changed, 25 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f624ca7..26b39c1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -712,9 +712,9 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
>
>  int kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
>  void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> -				     struct kvm_memory_slot *slot,
> -				     gfn_t gfn_offset, unsigned long mask);
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +				    struct kvm_memory_slot *slot,
> +				    gfn_t gfn_offset, unsigned long mask);
>  void kvm_mmu_zap_all(struct kvm *kvm);
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 29ad6f9..b88c5cc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -1081,20 +1081,26 @@ static int __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp, int level
>   *
>   * Used when we do not need to care about huge page mappings: e.g. during dirty
>   * logging we do not have any such mappings.
> + *
> + * Returns the number of pages protected.
>   */
> -void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> -				     struct kvm_memory_slot *slot,
> -				     gfn_t gfn_offset, unsigned long mask)
> +int kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> +				    struct kvm_memory_slot *slot,
> +				    gfn_t gfn_offset, unsigned long mask)
>  {
>  	unsigned long *rmapp;
> +	int nr_protected = 0;
>
>  	while (mask) {
>  		rmapp = &slot->rmap[gfn_offset + __ffs(mask)];
>  		__rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL);
> +		++nr_protected;
>
>  		/* clear the first set bit */
>  		mask &= mask - 1;
>  	}
> +
> +	return nr_protected;
>  }
>
>  static int rmap_write_protect(struct kvm *kvm, u64 gfn)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0d9a578..b636669 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3092,7 +3092,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  	unsigned long n, i;
>  	unsigned long *dirty_bitmap;
>  	unsigned long *dirty_bitmap_buffer;
> -	bool is_dirty = false;
> +	int nr_protected = 0;
>
>  	mutex_lock(&kvm->slots_lock);
>
> @@ -3121,15 +3121,23 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  		if (!dirty_bitmap[i])
>  			continue;
>
> -		is_dirty = true;
> -
>  		mask = xchg(&dirty_bitmap[i], 0);
>  		dirty_bitmap_buffer[i] = mask;
>
>  		offset = i * BITS_PER_LONG;
> -		kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
> +		nr_protected += kvm_mmu_write_protect_pt_masked(kvm, memslot,
> +								offset, mask);
> +		if (nr_protected > 2048) {

Can you expand on the reasoning behind this?
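For reference, the quoted hunk is cut off right at the threshold check,
so the body of that branch is not shown above. Going by the jbd
reference and the mention of rescheduling on contention, it presumably
follows the standard kernel lock-break pattern; a minimal sketch of
that pattern (illustrative only, not the actual patch body) would be:

	if (nr_protected > 2048) {
		if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
			/*
			 * Flush before releasing mmu_lock: sptes
			 * write-protected so far in this pass must not
			 * remain writable through stale TLB entries
			 * while other VCPUs run.
			 */
			kvm_flush_remote_tlbs(kvm);
			/* Drop mmu_lock, reschedule, then reacquire. */
			cond_resched_lock(&kvm->mmu_lock);
		}
		nr_protected = 0;
	}

Breaking the lock mid-scan would be safe for the dirty log itself: the
bits for this pass were already moved to dirty_bitmap_buffer with
xchg(), so any page not yet re-protected when the lock is released is
one that is being reported as dirty in this very pass anyway.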