Hi Marc and Alex, On Tue, Jan 11, 2022 at 8:11 AM Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote: > > Hi Marc, > > On Tue, Jan 11, 2022 at 01:30:41PM +0000, Marc Zyngier wrote: > > On Tue, 11 Jan 2022 07:37:57 +0000, > > Reiji Watanabe <reijiw@xxxxxxxxxx> wrote: > > > > > > Hi Alex, > > > > > > On Mon, Jan 10, 2022 at 2:48 AM Alexandru Elisei > > > <alexandru.elisei@xxxxxxx> wrote: > > > > > > > > Hi Reiji, > > > > > > > > On Sun, Jan 09, 2022 at 09:40:41PM -0800, Reiji Watanabe wrote: > > > > > vcpu_allowed_register_width() checks if all the VCPUs are either > > > > > all 32bit or all 64bit. Since the checking is done even for vCPUs > > > > > that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet, > > > > > the non-initialized vCPUs are erroneously treated as 64bit vCPU, > > > > > which causes the function to incorrectly detect a mixed-width VM. > > > > > > > > > > Fix vcpu_allowed_register_width() to skip the check for vCPUs that > > > > > are not initialized yet. > > > > > > > > > > Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation") > > > > > Signed-off-by: Reiji Watanabe <reijiw@xxxxxxxxxx> > > > > > --- > > > > > arch/arm64/kvm/reset.c | 11 +++++++++++ > > > > > 1 file changed, 11 insertions(+) > > > > > > > > > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c > > > > > index 426bd7fbc3fd..ef78bbc7566a 100644 > > > > > --- a/arch/arm64/kvm/reset.c > > > > > +++ b/arch/arm64/kvm/reset.c > > > > > @@ -180,8 +180,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu) > > > > > if (kvm_has_mte(vcpu->kvm) && is32bit) > > > > > return false; > > > > > > > > > > + /* > > > > > + * Make sure vcpu->arch.target setting is visible from others so > > > > > + * that the width consistency checking between two vCPUs is done > > > > > + * by at least one of them at KVM_ARM_VCPU_INIT. > > > > > + */ > > > > > + smp_mb(); > > > > > > > > From ARM DDI 0487G.a, page B2-146 ("Data Memory Barrier (DMB)"): > > > > > > > > "The DMB instruction is a memory barrier instruction that ensures the relative > > > > order of memory accesses before the barrier with memory accesses after the > > > > barrier." > > > > > > > > I'm going to assume from the comment that you are referring to completion of > > > > memory accesses ("Make sure [..] is visible from others"). Please correct me if > > > > I am wrong. In this case, DMB ensures ordering of memory accesses with regards > > > > to writes and reads, not *completion*. Have a look at > > > > tools/memory-model/litmus-tests/MP+fencewmbonceonce+fencermbonceonce.litmus for > > > > the classic message passing example as an example of memory ordering. > > > > Message passing and other patterns are also explained in ARM DDI 0487G.a, page > > > > K11-8363. > > > > > > > > I'm not saying that your approach is incorrect, but the commit message should > > > > explain what memory accesses are being ordered relative to each other and why. > > > > > > Thank you so much for the review. > > > What I meant with the comment was: > > > --- > > > DMB is used to make sure that writing @vcpu->arch.target, which is done > > > by kvm_vcpu_set_target() before getting here, is visible to other PEs > > > before the following kvm_for_each_vcpu iteration reads the other vCPUs' > > > target field. > > > --- > > > Did the comment become more clear ?? (Or do I use DMB incorrectly ?) > > > > > > > > + > > > > > /* Check that the vcpus are either all 32bit or all 64bit */ > > > > > kvm_for_each_vcpu(i, tmp, vcpu->kvm) { > > > > > + /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */ > > > > > + if (tmp->arch.target == -1) > > > > > + continue; > > > > > > I just noticed DMB(ishld) is needed here to assure ordering between > > > reading tmp->arch.target and reading vcpu->arch.features for this fix. > > > Similarly, kvm_vcpu_set_target() needs DMB(ishst) to assure ordering > > > between writing vcpu->arch.features and writing vcpu->arch.target... > > > I am going to fix them in the v2 series. > > > > Yes, you'd need at least this, and preferably in their smp_rmb/wmb > > variants. > > > > However, this looks like a pretty fragile construct, as there are > > multiple paths where we can change target (including some error > > paths from the run loop). > > > > I'd rather all changes to target and the feature bits happen under the > > kvm->lock, and take that lock when checking for consistency in > > vcpu_allowed_register_width(), as this isn't a fast path. I wrote the > > following, which is obviously incomplete and as usual untested. > > I think this is the better approach, because we also want to make sure that > a PE observes changes to target and features as soon as they have been > made, to avoid situations where one PE sets the target and the 32bit > feature, and another PE reads the old values and skips the check, in which > case memory ordering is not enough. Thank you for the comments and the suggestion. I will look into fixing this based on the suggestion. Thanks, Reiji > > Thanks, > Alex > > > > > Thanks, > > > > M. > > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index e4727dc771bf..42f2ab80646c 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -1061,7 +1061,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level, > > static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, > > const struct kvm_vcpu_init *init) > > { > > - unsigned int i, ret; > > + unsigned int i; > > + int ret = 0; > > u32 phys_target = kvm_target_cpu(); > > > > if (init->target != phys_target) > > @@ -1074,32 +1075,46 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, > > if (vcpu->arch.target != -1 && vcpu->arch.target != init->target) > > return -EINVAL; > > > > + /* Hazard against a concurent check of the target in kvm_reset_vcpu() */ > > + mutex_lock(&vcpu->kvm->lock); > > + > > /* -ENOENT for unknown features, -EINVAL for invalid combinations. */ > > for (i = 0; i < sizeof(init->features) * 8; i++) { > > bool set = (init->features[i / 32] & (1 << (i % 32))); > > > > - if (set && i >= KVM_VCPU_MAX_FEATURES) > > - return -ENOENT; > > + if (set && i >= KVM_VCPU_MAX_FEATURES) { > > + ret = -ENOENT; > > + break; > > + } > > > > /* > > * Secondary and subsequent calls to KVM_ARM_VCPU_INIT must > > * use the same feature set. > > */ > > if (vcpu->arch.target != -1 && i < KVM_VCPU_MAX_FEATURES && > > - test_bit(i, vcpu->arch.features) != set) > > - return -EINVAL; > > + test_bit(i, vcpu->arch.features) != set) { > > + ret = -EINVAL; > > + break; > > + } > > > > if (set) > > set_bit(i, vcpu->arch.features); > > } > > > > - vcpu->arch.target = phys_target; > > + if (!ret) > > + vcpu->arch.target = phys_target; > > + > > + mutex_unlock(&vcpu->kvm->lock); > > + if (ret) > > + return ret; > > > > /* Now we know what it is, we can reset it. */ > > ret = kvm_reset_vcpu(vcpu); > > if (ret) { > > + mutex_lock(&vcpu->kvm->lock); > > vcpu->arch.target = -1; > > bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); > > + mutex_unlock(&vcpu->kvm->lock); > > } > > > > return ret; > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c > > index ef78bbc7566a..fae88a703140 100644 > > --- a/arch/arm64/kvm/reset.c > > +++ b/arch/arm64/kvm/reset.c > > @@ -180,13 +180,6 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu) > > if (kvm_has_mte(vcpu->kvm) && is32bit) > > return false; > > > > - /* > > - * Make sure vcpu->arch.target setting is visible from others so > > - * that the width consistency checking between two vCPUs is done > > - * by at least one of them at KVM_ARM_VCPU_INIT. > > - */ > > - smp_mb(); > > - > > /* Check that the vcpus are either all 32bit or all 64bit */ > > kvm_for_each_vcpu(i, tmp, vcpu->kvm) { > > /* Skip if KVM_ARM_VCPU_INIT is not done for the vcpu yet */ > > @@ -222,14 +215,19 @@ static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu) > > int kvm_reset_vcpu(struct kvm_vcpu *vcpu) > > { > > struct vcpu_reset_state reset_state; > > - int ret; > > + int ret = -EINVAL; > > bool loaded; > > u32 pstate; > > > > mutex_lock(&vcpu->kvm->lock); > > - reset_state = vcpu->arch.reset_state; > > - WRITE_ONCE(vcpu->arch.reset_state.reset, false); > > + if (vcpu_allowed_register_width(vcpu)) { > > + reset_state = vcpu->arch.reset_state; > > + WRITE_ONCE(vcpu->arch.reset_state.reset, false); > > + ret = 0; > > + } > > mutex_unlock(&vcpu->kvm->lock); > > + if (ret) > > + goto out; > > > > /* Reset PMU outside of the non-preemptible section */ > > kvm_pmu_vcpu_reset(vcpu); > > @@ -257,11 +255,6 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) > > } > > } > > > > - if (!vcpu_allowed_register_width(vcpu)) { > > - ret = -EINVAL; > > - goto out; > > - } > > - > > switch (vcpu->arch.target) { > > default: > > if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) { > > > > -- > > Without deviation from the norm, progress is not possible.