On Fri, Oct 07, 2022 at 07:38:19AM +0800, Gavin Shan wrote: > Hi Peter, Hi, Gavin, > > On 10/7/22 4:28 AM, Peter Xu wrote: > > On Wed, Oct 05, 2022 at 08:41:50AM +0800, Gavin Shan wrote: > > > -8.29 KVM_CAP_DIRTY_LOG_RING/KVM_CAP_DIRTY_LOG_RING_ACQ_REL > > > ----------------------------------------------------------- > > > +8.29 KVM_CAP_DIRTY_LOG_{RING, RING_ACQ_REL, RING_ALLOW_BITMAP} > > > +-------------------------------------------------------------- > > > > Shall we make it a standalone cap, just to rely on DIRTY_RING[_ACQ_REL] > > being enabled first, instead of making the three caps at the same level? > > > > E.g. we can skip creating bitmap for DIRTY_RING[_ACQ_REL] && !_ALLOW_BITMAP > > (x86). > > > > Both KVM_CAP_DIRTY_LOG_RING and KVM_CAP_DIRTY_LONG_RING_ACQ_REL are available > to x86. So KVM_CAP_DIRTY_LONG_RING_ACQ_REL can be enabled on x86 in theory. > However, the idea to disallow bitmap for KVM_CAP_DIRTY_LOG_RING (x86) makes > sense to me. I think you may be suggesting something like below. > > - bool struct kvm::dirty_ring_allow_bitmap > > - In kvm_vm_ioctl_enable_dirty_log_ring(), set 'dirty_ring_allow_bitmap' to > true when the capability is KVM_CAP_DIRTY_LONG_RING_ACQ_REL What I wanted to do is to decouple the ACQ_REL with ALLOW_BITMAP, so mostly as what you suggested, except.. > > static int kvm_vm_ioctl_enable_dirty_log_ring(struct kvm *kvm, u32 cap, u32 size) > { > : > mutex_lock(&kvm->lock); > > if (kvm->created_vcpus) { > /* We don't allow to change this value after vcpu created */ > r = -EINVAL; > } else { > kvm->dirty_ring_size = size; .. here I'd not set dirty_ring_allow_bitmap at all so I'd drop below line, instead.. > kvm->dirty_ring_allow_bitmap = (cap == KVM_CAP_DIRTY_LOG_RING_ACQ_REL); > r = 0; > } > > mutex_unlock(&kvm->lock); > return r; > } > - In kvm_vm_ioctl_check_extension_generic(), KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP > is always flase until KVM_CAP_DIRTY_LOG_RING_ACQ_REL is enabled. > > static long kvm_vm_ioctl_check_extension_generic(...) > { > : > case KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP: > return kvm->dirty_ring_allow_bitmap ? 1 : 0; ... here we always return 1, OTOH in kvm_vm_ioctl_enable_cap_generic(): case KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP: if (kvm->dirty_ring_size) return -EINVAL; kvm->dirty_ring_allow_bitmap = true; return 0; A side effect of checking dirty_ring_size is then we'll be sure to have no vcpu created too. Maybe we should also check no memslot created to make sure the bitmaps are not created. Then if the userspace wants to use the bitmap altogether with the ring, it needs to first detect KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP and enable it before it enables KVM_CAP_DIRTY_LOG_RING. One trick on ALLOW_BITMAP is in mark_page_dirty_in_slot() - after we allow !vcpu case we'll need to make sure it won't accidentally try to set bitmap for !ALLOW_BITMAP, because in that case the bitmap pointer is NULL so set_bit_le() will directly crash the kernel. We could keep the old flavor of having a WARN_ON_ONCE(!vcpu && !ALLOW_BITMAP) then return, but since now the userspace can easily trigger this (e.g. on ARM, a malicious userapp can have DIRTY_RING && !ALLOW_BITMAP, then it can simply trigger the gic ioctl to trigger host warning), I think the better approach is we can kill the process in that case. Not sure whether there's anything better we can do. > } > > - The suggested dirty_ring_exclusive() is used. > > > > @@ -2060,10 +2060,6 @@ int kvm_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log, > > > unsigned long n; > > > unsigned long any = 0; > > > - /* Dirty ring tracking is exclusive to dirty log tracking */ > > > - if (kvm->dirty_ring_size) > > > - return -ENXIO; > > > > Then we can also have one dirty_ring_exclusive(), with something like: > > > > bool dirty_ring_exclusive(struct kvm *kvm) > > { > > return kvm->dirty_ring_size && !kvm->dirty_ring_allow_bitmap; > > } > > > > Does it make sense? Thanks, > > > > Thanks, > Gavin > -- Peter Xu