On Thu, Oct 27, 2022, Paolo Bonzini wrote:
> So, use the short size at activation time as well.  This means
> re-activating the cache if the guest requests the hypercall page
> multiple times with different word sizes (this can happen when
> kexec-ing, for example).

I don't understand the motivation for allowing a conditionally valid GPA.
I see a lot of complexity and sub-optimal error handling for a use case
that no one cares about.  Realistically, userspace is never going to
provide a GPA that only works some of the time, because doing otherwise
is just asking for a dead guest.

> +static int kvm_xen_reactivate_runstate_gpcs(struct kvm *kvm)
> +{
> +	struct kvm_vcpu *vcpu;
> +	unsigned long i;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		if (vcpu->arch.xen.runstate_cache.active) {

This is not safe when called from kvm_xen_write_hypercall_page(), which
doesn't acquire kvm->lock and thus doesn't guard against a concurrent
call via kvm_xen_vcpu_set_attr().

That's likely a bug in itself, but even if that issue is fixed, I don't
see how this yields a better uAPI than forcing userspace to provide an
address that is valid for all modes.  If the address becomes bad when
the guest changes the hypercall page, the guest is completely hosed
through no fault of its own, whereas limiting the misaligned detection
to KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADDR means that any "bad" address
will result in immediate failure, i.e. makes it so that errors are 100%
userspace misconfiguration bugs.

> +			int r = kvm_xen_activate_runstate_gpc(vcpu,
> +					vcpu->arch.xen.runstate_cache.gpa);
> +			if (r < 0)

Returning immediately is wrong, as later vCPUs will have a valid, active
cache that hasn't been verified for 64-bit mode.

> +				return r;
> +		}
> +	}
> +	return 0;
> +}
> +
>  void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
>  {
>  	struct kvm_vcpu_xen *vx = &v->arch.xen;
> @@ -212,11 +243,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state)
>  	if (!vx->runstate_cache.active)
>  		return;
>  
> -	if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode)
> -		user_len = sizeof(struct vcpu_runstate_info);
> -	else
> -		user_len = sizeof(struct compat_vcpu_runstate_info);
> -
> +	user_len = kvm_xen_runstate_info_size(v->kvm);
>  	read_lock_irqsave(&gpc->lock, flags);
>  	while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
>  					   user_len)) {
> @@ -461,7 +488,7 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
>  		mutex_lock(&kvm->lock);
>  		kvm->arch.xen.long_mode = !!data->u.long_mode;
>  		mutex_unlock(&kvm->lock);
> -		r = 0;
> +		r = kvm_xen_reactivate_runstate_gpcs(kvm);

Needs to be called under kvm->lock.  This path also needs to acquire
kvm->srcu.

>  	}
>  	break;
> 
> @@ -596,9 +623,7 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
>  			break;
>  		}
> 
> -		r = kvm_gpc_activate(vcpu->kvm, &vcpu->arch.xen.runstate_cache,
> -				     NULL, KVM_HOST_USES_PFN, data->u.gpa,
> -				     sizeof(struct vcpu_runstate_info));
> +		r = kvm_xen_activate_runstate_gpc(vcpu, data->u.gpa);
>  		break;
> 
>  	case KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_CURRENT:
> @@ -843,9 +868,13 @@ int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data)
>  	u32 page_num = data & ~PAGE_MASK;
>  	u64 page_addr = data & PAGE_MASK;
>  	bool lm = is_long_mode(vcpu);
> +	int r;
> 
>  	/* Latch long_mode for shared_info pages etc. */
>  	vcpu->kvm->arch.xen.long_mode = lm;
> +	r = kvm_xen_reactivate_runstate_gpcs(kvm);
> +	if (r < 0)
> +		return 1;

Aren't we just making up behavior at this point?  Injecting a #GP into
the guest for what is a completely legal operation from the guest's
perspective seems all kinds of wrong.
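To make the early-return concern concrete, here's a standalone userspace
sketch (not kernel code; `fake_vcpu`, `activate_one`, and both
`reactivate_all_*` helpers are invented for illustration).  It models a
runstate cache as a (gpa, active) pair whose mapped length grows when
long mode flips, and contrasts the early-return loop with one possible
alternative that visits every vCPU and deactivates any cache that fails
re-validation:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RUNSTATE_LONG_SIZE 48U   /* stand-in for sizeof(struct vcpu_runstate_info) */
#define FAKE_PAGE_SIZE     4096U

struct fake_vcpu {
	bool     active;         /* models runstate_cache.active */
	uint64_t gpa;            /* models runstate_cache.gpa */
};

/* Models re-activation: fails if (gpa, len) would cross a page boundary,
 * which can start happening only once long_mode grows the mapped length. */
static int activate_one(struct fake_vcpu *v, unsigned int len)
{
	if ((v->gpa & (FAKE_PAGE_SIZE - 1)) + len > FAKE_PAGE_SIZE)
		return -EINVAL;
	return 0;
}

/* The pattern under review: stops at the first failure, so later vCPUs
 * keep caches that are still marked active but were never re-checked
 * against the new (larger) size. */
static int reactivate_all_early_return(struct fake_vcpu *v, size_t n,
				       unsigned int len)
{
	for (size_t i = 0; i < n; i++) {
		if (!v[i].active)
			continue;
		int r = activate_one(&v[i], len);
		if (r < 0)
			return r;
	}
	return 0;
}

/* One possible alternative: sweep every vCPU, deactivate any cache that
 * fails re-validation, and report the first error afterwards. */
static int reactivate_all_full_sweep(struct fake_vcpu *v, size_t n,
				     unsigned int len)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		if (!v[i].active)
			continue;
		int r = activate_one(&v[i], len);
		if (r < 0) {
			v[i].active = false;   /* don't leave a stale mapping */
			if (!ret)
				ret = r;
		}
	}
	return ret;
}
```

The full-sweep variant only illustrates the loop-termination point; it
doesn't address the review's larger objection that a conditionally valid
GPA shouldn't be accepted in the first place.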