On Mon, Oct 31, 2022 at 08:36:16AM +0800, Gavin Shan wrote: > ARM64 needs to dirty memory outside of a VCPU context when VGIC/ITS is > enabled. It's conflicting with that ring-based dirty page tracking always > requires a running VCPU context. > > Introduce a new flavor of dirty ring that requires the use of both VCPU > dirty rings and a dirty bitmap. The expectation is that for non-VCPU > sources of dirty memory (such as the VGIC/ITS on arm64), KVM writes to > the dirty bitmap. Userspace should scan the dirty bitmap before migrating > the VM to the target. > > Use an additional capability to advertise this behavior. The newly added > capability (KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP) can't be enabled before > KVM_CAP_DIRTY_LOG_RING_ACQ_REL on ARM64. In this way, the newly added > capability is treated as an extension of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Whatever ordering requirements we settle on between these capabilities needs to be documented as well. [...] > @@ -4588,6 +4594,13 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, > return -EINVAL; > > return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]); > + case KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP: > + if (!IS_ENABLED(CONFIG_HAVE_KVM_DIRTY_RING_WITH_BITMAP) || > + !kvm->dirty_ring_size) I believe this ordering requirement is problematic, as it piles on top of an existing problem w.r.t. KVM_CAP_DIRTY_LOG_RING v. memslot creation. Example: - Enable KVM_CAP_DIRTY_LOG_RING - Create some memslots w/ dirty logging enabled (note that the bitmap is _not_ allocated) - Enable KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP - Save ITS tables and get a NULL dereference in mark_page_dirty_in_slot(): if (vcpu && kvm->dirty_ring_size) kvm_dirty_ring_push(&vcpu->dirty_ring, slot, rel_gfn); else -------> set_bit_le(rel_gfn, memslot->dirty_bitmap); Similarly, KVM may unnecessarily allocate bitmaps if dirty logging is enabled on memslots before KVM_CAP_DIRTY_LOG_RING is enabled. You could paper over this issue by disallowing DIRTY_RING_WITH_BITMAP if DIRTY_LOG_RING has already been enabled, but the better approach would be to explicitly check kvm_memslots_empty() such that the real dependency is obvious. Peter, hadn't you mentioned something about checking against memslots in an earlier revision? -- Thanks, Oliver