On Thu, Nov 13, 2014 at 05:57:47PM -0800, Mario Smarduch wrote: > Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl > calls. We call kvm_get_dirty_log_protect() function to do most of the work. > > Reviewed-by: Marc Zyngier <marc.zyngier@xxxxxxx> > Signed-off-by: Mario Smarduch <m.smarduch@xxxxxxxxxxx> > --- > arch/arm/kvm/Kconfig | 1 + > arch/arm/kvm/arm.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ > arch/arm/kvm/mmu.c | 22 ++++++++++++++++++++++ > 3 files changed, 69 insertions(+) > > diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig > index f27f336..a8d1ace 100644 > --- a/arch/arm/kvm/Kconfig > +++ b/arch/arm/kvm/Kconfig > @@ -24,6 +24,7 @@ config KVM > select HAVE_KVM_ARCH_TLB_FLUSH_ALL > select KVM_MMIO > select KVM_ARM_HOST > + select KVM_GENERIC_DIRTYLOG_READ_PROTECT > depends on ARM_VIRT_EXT && ARM_LPAE > ---help--- > Support hosting virtualized guest machines. You will also > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c > index a99e0cd..040c0f3 100644 > --- a/arch/arm/kvm/arm.c > +++ b/arch/arm/kvm/arm.c > @@ -737,9 +737,55 @@ long kvm_arch_vcpu_ioctl(struct file *filp, > } > } > > +/** > + * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot > + * @kvm: kvm instance > + * @log: slot id and address to which we copy the log > + * > + * We need to keep it in mind that VCPU threads can write to the bitmap > + * concurrently. So, to avoid losing data, we keep the following order for > + * each bit: > + * > + * 1. Take a snapshot of the bit and clear it if needed. > + * 2. Write protect the corresponding page. > + * 3. Copy the snapshot to the userspace. > + * 4. Flush TLB's if needed. > + * > + * Steps 1,2,3 are handled by kvm_get_dirty_log_protect(). > + * Between 2 and 4, the guest may write to the page using the remaining TLB > + * entry. This is not a problem because the page is reported dirty using > + * the snapshot taken before and step 4 ensures that writes done after > + * exiting to userspace will be logged for the next call. > + */ > int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log) > { > +#ifdef CONFIG_ARM > + int r; > + bool is_dirty = false; > + > + mutex_lock(&kvm->slots_lock); > + > + r = kvm_get_dirty_log_protect(kvm, log, &is_dirty); > + if (r) > + goto out; > + > + /* > + * kvm_get_dirty_log_protect() may fail and we may skip TLB flush > + * leaving few stale spte TLB entries which is harmless, given we're > + * just write protecting spte's, so few stale TLB's will be left in > + * original R/W state. And since the bitmap is corrupt userspace will > + * error out anyway (i.e. during migration or dirty page loging for s/loging/logging/ Hmmm, where is this behavior specified in the ABI? If you call KVM_GET_DIRTY_LOG subsequently, you will now potentially have unreported dirty pages, which can be completely avoided by removing the if-statement and the goto above. Why not simply do that and get rid of this comment? > + * other reasons) terminating dirty page logging. > + */ > + if (is_dirty) > + kvm_flush_remote_tlbs(kvm); > +out: > + mutex_unlock(&kvm->slots_lock); > + > + return r; > +#else /* ARM64 */ > return -EINVAL; > +#endif > } > > static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm, > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c > index 1e8b6a9..8137455 100644 > --- a/arch/arm/kvm/mmu.c > +++ b/arch/arm/kvm/mmu.c > @@ -870,6 +870,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot) > spin_unlock(&kvm->mmu_lock); > kvm_flush_remote_tlbs(kvm); > } > + > +/** > + * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages > + * @kvm: The KVM pointer > + * @slot: The memory slot associated with mask > + * @gfn_offset: The gfn offset in memory slot > + * @mask: The mask of dirty pages at offset 'gfn_offset' in this memory > + * slot to be write protected > + * > + * Walks bits set in mask write protects the associated pte's. Caller must > + * acquire kvm_mmu_lock. > + */ > +void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm, > + struct kvm_memory_slot *slot, > + gfn_t gfn_offset, unsigned long mask) > +{ > + phys_addr_t base_gfn = slot->base_gfn + gfn_offset; > + phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT; > + phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT; > + > + stage2_wp_range(kvm, start, end); > +} > #endif > > static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > -- > 1.7.9.5 > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html