Hi, this is v2 of "moving dirty bitmaps to user space!"

With this patch set, I think everything we need becomes clear. So we want
to step forward and be ready for the final version in the near future; of
course, this depends on the x86 and ppc asm issues.

BTW, from whom can I get an ACK for ppc and ia64? I would like to add them
to the Cc list if possible, thank you.

Patch1: introduce slot level dirty state management
  This patch is independent of the other patches and seems to be useful
  even without the following parts.

Patch2: introduce wrapper functions to create and destroy dirty bitmaps
  Cleanup patch.

Patch3: introduce a wrapper function to copy dirty bitmaps to user space
  This is for dealing with copy_in_user() things cleanly.

Patch4: change mark_page_dirty() to handle endian issues explicitly
  Later, the __set_bit() part will be replaced with a *_user function.

Patch5: moving dirty bitmaps to user space
  Replace dirty bitmap manipulations with *_user functions.

Patch6: introduce a new API for getting dirty bitmaps
  This is to access dirty bitmaps from user space.

Changelog:
 - support for all architectures
   We have achieved this without pinning.
 - one possible API suggestion
 - temporary copy_in_user like function
 - temporary set_bit_user like function with __get_user() and __put_user()
   We can use this as a generic set_bit_user_non_atomic(). Of course, we
   need to optimize this part with arch specific ones: we are testing
   some versions for x86 now.

What we are thinking about:
 - about set_bit_user_non_atomic()
   We noticed that ia64 won't need this: see patch1 and patch5.
   So all we have to do is to complete the implementations for x86 and ppc.

   ** x86 and ppc don't include asm-generic uaccess, so we have to put
      these into them separately.

 - about the new API
   There are many possible styles to make use of this work. E.g. if we
   export the addresses of both bitmaps, we don't need to export them
   at the switch timing: but we cannot reuse the current structures in
   this case. Which is better?
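
To make the set_bit_user_non_atomic() item above a bit more concrete, here
is a minimal user-space model of the read-modify-write it performs. This is
only a sketch and not the code from the patches: the function name
set_bit_le_non_atomic() is hypothetical, plain loads and stores stand in for
the kernel's __get_user() and __put_user(), and a little-endian bit order
within the bitmap is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * User-space model only. In the kernel the load would be __get_user()
 * and the store __put_user(); here ordinary memory accesses are used.
 * The name and the bounds check are assumptions made for illustration.
 */
static int set_bit_le_non_atomic(unsigned long nr, uint8_t *bitmap, size_t len)
{
    size_t byte = nr / 8;           /* byte that holds bit 'nr' */
    uint8_t val;

    if (byte >= len)
        return -1;                  /* models a fault on the user access */

    val = bitmap[byte];             /* read  (stands in for __get_user()) */
    val |= (uint8_t)(1u << (nr % 8));
    bitmap[byte] = val;             /* write (stands in for __put_user()) */

    return 0;
}
```

Addressing the bitmap byte by byte keeps the layout the same on big-endian
and little-endian hosts. The update is not atomic: two concurrent callers
touching the same byte can lose a bit, which is why the name says
non_atomic.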
===
Appendix:

To test patch 6, we are using the following patch for qemu-kvm.

---
 configure  |    2 +-
 qemu-kvm.c |   22 +++++++++++++++++-----
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/configure b/configure
index be8dac4..0b2d017 100755
--- a/configure
+++ b/configure
@@ -1498,7 +1498,7 @@ fi
 if test "$kvm" != "no" ; then
     cat > $TMPC <<EOF
 #include <linux/kvm.h>
-#if !defined(KVM_API_VERSION) || KVM_API_VERSION < 12 || KVM_API_VERSION > 12
+#if !defined(KVM_API_VERSION) || KVM_API_VERSION < 13 || KVM_API_VERSION > 13
 #error Invalid KVM version
 #endif
 #if !defined(KVM_CAP_USER_MEMORY)
diff --git a/qemu-kvm.c b/qemu-kvm.c
index cc5b352..087adea 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -44,7 +44,7 @@
 #define BUS_MCEERR_AO 5
 #endif
 
-#define EXPECTED_KVM_API_VERSION 12
+#define EXPECTED_KVM_API_VERSION 13
 
 #if EXPECTED_KVM_API_VERSION != KVM_API_VERSION
 #error libkvm: userspace and kernel version mismatch
@@ -684,6 +684,21 @@ static int kvm_get_map(kvm_context_t kvm, int ioctl_num, int slot, void *buf)
     return 0;
 }
 
+static int kvm_switch_map(kvm_context_t kvm, int slot, void **buf)
+{
+    int r;
+    struct kvm_dirty_log log = {
+        .slot = slot,
+    };
+
+    r = kvm_vm_ioctl(kvm_state, KVM_SWITCH_DIRTY_LOG, &log);
+    if (r < 0)
+        return r;
+
+    *buf = (void *)log.addr;
+    return 0;
+}
+
 int kvm_get_dirty_pages(kvm_context_t kvm, unsigned long phys_addr, void *buf)
 {
     int slot;
@@ -706,14 +721,11 @@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr,
     for (i = 0; i < KVM_MAX_NUM_MEM_REGIONS; ++i) {
         if ((slots[i].len && (uint64_t) slots[i].phys_addr >= phys_addr)
             && ((uint64_t) slots[i].phys_addr + slots[i].len <= end_addr)) {
-            buf = qemu_malloc(BITMAP_SIZE(slots[i].len));
-            r = kvm_get_map(kvm, KVM_GET_DIRTY_LOG, i, buf);
+            r = kvm_switch_map(kvm, i, &buf);
             if (r) {
-                qemu_free(buf);
                 return r;
             }
             r = cb(slots[i].phys_addr, slots[i].len, buf, opaque);
-            qemu_free(buf);
             if (r)
                 return r;
         }
-- 
1.6.3.3