On Mon, Oct 10, 2022 at 11:18:55PM +0000, Oliver Upton wrote:
> On Fri, Oct 07, 2022 at 10:31:49AM -0400, Peter Xu wrote:
>
> [...]
>
> > > - In kvm_vm_ioctl_enable_dirty_log_ring(), set 'dirty_ring_allow_bitmap' to
> > >   true when the capability is KVM_CAP_DIRTY_LOG_RING_ACQ_REL
> >
> > What I wanted to do is to decouple the ACQ_REL with ALLOW_BITMAP, so mostly
> > as what you suggested, except..
>
> +1
>
> > >
> > >   static int kvm_vm_ioctl_enable_dirty_log_ring(struct kvm *kvm, u32 cap, u32 size)
> > >   {
> > >     :
> > >     mutex_lock(&kvm->lock);
> > >
> > >     if (kvm->created_vcpus) {
> > >        /* We don't allow to change this value after vcpu created */
> > >        r = -EINVAL;
> > >     } else {
> > >        kvm->dirty_ring_size = size;
> >
> > .. here I'd not set dirty_ring_allow_bitmap at all so I'd drop below line,
> > instead..
> >
> > >        kvm->dirty_ring_allow_bitmap = (cap == KVM_CAP_DIRTY_LOG_RING_ACQ_REL);
> > >        r = 0;
> > >     }
> > >
> > >     mutex_unlock(&kvm->lock);
> > >     return r;
> > >   }
> > > - In kvm_vm_ioctl_check_extension_generic(), KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP
> > >   is always false until KVM_CAP_DIRTY_LOG_RING_ACQ_REL is enabled.
> > >
> > >   static long kvm_vm_ioctl_check_extension_generic(...)
> > >   {
> > >     :
> > >     case KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP:
> > >         return kvm->dirty_ring_allow_bitmap ? 1 : 0;
> >
> > ... here we always return 1, OTOH in kvm_vm_ioctl_enable_cap_generic():
> >
> >       case KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP:
> >            if (kvm->dirty_ring_size)
> >                 return -EINVAL;
> >            kvm->dirty_ring_allow_bitmap = true;
> >            return 0;
> >
> > A side effect of checking dirty_ring_size is then we'll be sure to have no
> > vcpu created too.  Maybe we should also check no memslot created to make
> > sure the bitmaps are not created.
>
> I'm not sure I follow... What prevents userspace from creating a vCPU
> between enabling the two caps?

Enabling the dirty ring requires that no vcpu has been created, so as to
make sure all the vcpus will have the ring structures allocated as long as
the ring is enabled for the vm.  That is done in
kvm_vm_ioctl_enable_dirty_log_ring():

	if (kvm->created_vcpus) {
		/* We don't allow to change this value after vcpu created */
		r = -EINVAL;
	} else {
		kvm->dirty_ring_size = size;
		r = 0;
	}

Then if KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP checks dirty_ring_size first,
ALLOW_BITMAP has to be enabled before DIRTY_RING, and DIRTY_RING before any
vcpu creation, so both end up configured before the first vcpu exists.

>
> > Then if the userspace wants to use the bitmap altogether with the ring, it
> > needs to first detect KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP and enable it
> > before it enables KVM_CAP_DIRTY_LOG_RING.
> >
> > One trick on ALLOW_BITMAP is in mark_page_dirty_in_slot() - after we allow
> > !vcpu case we'll need to make sure it won't accidentally try to set bitmap
> > for !ALLOW_BITMAP, because in that case the bitmap pointer is NULL so
> > set_bit_le() will directly crash the kernel.
> >
> > We could keep the old flavor of having a WARN_ON_ONCE(!vcpu &&
> > !ALLOW_BITMAP) then return, but since now the userspace can easily trigger
> > this (e.g. on ARM, a malicious userapp can have DIRTY_RING &&
> > !ALLOW_BITMAP, then it can simply trigger the gic ioctl to trigger host
> > warning), I think the better approach is we can kill the process in that
> > case.  Not sure whether there's anything better we can do.
>
> I don't believe !ALLOW_BITMAP && DIRTY_RING is a valid configuration for
> arm64 given the fact that we'll dirty memory outside of a vCPU context.

Right, it's not valid, but after Gavin's current series it'll be possible
to configure.  IOW a malicious app can leverage this to trigger a host
warning, which is IMHO not wanted.

>
> Could ALLOW_BITMAP be a requirement of DIRTY_RING, thereby making
> userspace fail fast? Otherwise (at least on arm64) your VM is DOA on the
> target. With that the old WARN() could be preserved, as you suggest.

It's just that x86 doesn't need the bitmap, so it'll be a pure waste there
otherwise.  It's not only about the memory that will be wasted (that's
guest mem size / 32k), but also the sync() process for x86 will be all
zeros and totally meaningless - note that the sync() of bitmap will be part
of VM downtime in this case (we need to sync() after turning VM off), so it
will make x86 downtime larger but without any benefit.

> On
> top of that there would no longer be a need to test for memslot creation
> when userspace attempts to enable KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP.

--
Peter Xu
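
The ordering argument in this thread can be sketched as a small userspace
model.  This is not kernel code and not what any posted patch does: struct
vm_model and the model_*() helpers are hypothetical names, and only the two
checks (dirty_ring_size in the ALLOW_BITMAP path, created_vcpus in the ring
path) are taken from the snippets quoted above.  The last helper works
through the "guest mem size / 32k" estimate, assuming 4 KiB pages and one
bitmap bit per page.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three struct kvm fields discussed above. */
struct vm_model {
	int created_vcpus;
	uint32_t dirty_ring_size;
	bool dirty_ring_allow_bitmap;
};

/*
 * Models the proposed KVM_CAP_DIRTY_LOG_RING_ALLOW_BITMAP enable path:
 * refused once the ring has been sized.
 */
static int model_enable_allow_bitmap(struct vm_model *vm)
{
	if (vm->dirty_ring_size)
		return -EINVAL;
	vm->dirty_ring_allow_bitmap = true;
	return 0;
}

/*
 * Models the check in kvm_vm_ioctl_enable_dirty_log_ring():
 * refused once any vcpu has been created.
 */
static int model_enable_dirty_ring(struct vm_model *vm, uint32_t size)
{
	if (vm->created_vcpus)
		return -EINVAL;
	vm->dirty_ring_size = size;
	return 0;
}

/* Models vcpu creation; after this the ring can no longer be enabled. */
static void model_create_vcpu(struct vm_model *vm)
{
	vm->created_vcpus++;
}

/*
 * Dirty bitmap size for a guest, assuming 4 KiB pages and one bit per
 * page: bytes = mem / 4096 / 8 = mem / 32768.
 */
static uint64_t model_bitmap_bytes(uint64_t guest_mem_bytes)
{
	return guest_mem_bytes / 4096 / 8;
}
```

In this model, enabling ALLOW_BITMAP, then the ring, then creating a vcpu
succeeds at every step, while enabling the ring first makes the later
ALLOW_BITMAP step return -EINVAL.  Because the ring itself cannot be enabled
after a vcpu exists, a vcpu created between the two caps only makes the ring
enable fail.  For a 1 TiB guest the bitmap estimate comes to 32 MiB.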