On Thu, Feb 21, 2019 at 05:48:59PM +0000, Julien Thierry wrote:
> Hi Dave,
>
> On 18/02/2019 19:52, Dave Martin wrote:
> > This patch adds a new pseudo-register KVM_REG_ARM64_SVE_VLS to
> > allow userspace to set and query the set of vector lengths visible
> > to the guest, along with corresponding storage in struct
> > kvm_vcpu_arch.
> >
> > In the future, multiple register slices per SVE register may be
> > visible through the ioctl interface. Once the set of slices has
> > been determined we would not be able to allow the vector length set
> > to be changed any more, in order to avoid userspace seeing
> > inconsistent sets of registers. For this reason, this patch adds
> > support to track vcpu finalization explicitly, and enforce proper
> > sequencing of ioctls.
> >
> > The new pseudo-register is not exposed yet. Subsequent patches
> > will allow SVE to be turned on for guest vcpus, making it visible.
> >
> > Signed-off-by: Dave Martin <Dave.Martin@xxxxxxx>
> >
> > ---
> >
> > Changes since v4:
> >
> >  * Add a UAPI header comment indicating the pseudo-register status of
> >    KVM_REG_ARM64_SVE_VLS.
> >
> >  * Get rid of the sve_vqs[] array from struct kvm_vcpu_arch. This
> >    array is pointless, because its contents must match the host's
> >    internal sve_vq_map anyway, up to vcpu->arch.sve_max_vl.
> >
> >    The ioctl register accessors are slow-path code, so we can decode
> >    or reconstruct sve_vqs[] on demand instead, for exchange with
> >    userspace.
> >
> >  * For compatibility with potential future architecture extensions,
> >    enabling vector lengths above 256 bytes for the guest is explicitly
> >    disallowed now (because the code for exposing additional bits
> >    through ioctl is currently missing). This can be addressed later
> >    if/when needed.
> >
> >    Note:
> >
> >  * I defensively pass an array by pointer here, to help avoid
> >    accidentally breaking assumptions during future maintenance.
> >
> >    Due to (over?)zealous constification, this causes the following
> >    sparse warning. I think this is sparse getting confused: I am not
> >    relying on any kernel-specific semantics here, and GCC generates no
> >    warning.
> >
> > +arch/arm64/kvm/guest.c:33: warning: incorrect type in argument 1 (different modifiers)
> > +arch/arm64/kvm/guest.c:33:    expected unsigned long long const [usertype] ( *const vqs )[8]
> > +arch/arm64/kvm/guest.c:33:    got unsigned long long [usertype] ( * )[8]
> >
> > ---

[...]

> > diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h

[...]

> > +static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +	unsigned int max_vq, vq;
> > +	u64 vqs[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
> > +
> > +	if (WARN_ON(!sve_vl_valid(vcpu->arch.sve_max_vl)))
> > +		return -EINVAL;
> > +
> > +	memset(vqs, 0, sizeof(vqs));
> > +
> > +	max_vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> > +	for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> > +		if (sve_vq_available(vq))
> > +			vqs[vq_word(vq)] |= vq_mask(vq);
> > +
> > +	BUILD_BUG_ON(sizeof(vqs) != KVM_REG_SIZE(reg->id));
>
> reg->id is not known at build time. From my understanding of
> BUILD_BUG_ON(), things actually end up evaluated at runtime, but I'm not
> sure what happens when doing sizeof(char[1 - 2*0]) at runtime.
>
> Anyway, I don't think this is intended.

There's no runtime check: BUILD_BUG_ON() will cause compilation to fail
if the required condition doesn't fall out from optimisation.

Because of the way this function is called, reg->id is always
KVM_REG_ARM64_SVE_VLS, so inlining and constant propagation will make
this check pass (and compile to nothing).

We can assume a certain amount of inlining: the kernel officially can't
be built without optimisation. But the precise build configuration can
sometimes have an effect here -- so it may be better not to rely on this
working for this slow-path code.
I'll convert these to if (WARN_ON()) return -EIO or something, or drop
them if I can get rid of them in other refactoring.

The fact that struct kvm_one_reg lacks any "size" field is rather
unfortunate, since it provides no explicit way for userspace to tell us
the size of its buffer.

> > +	if (copy_to_user((void __user *)reg->addr, vqs, sizeof(vqs)))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +
> > +static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> > +{
> > +	unsigned int max_vq, vq;
> > +	u64 vqs[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
> > +
> > +	if (kvm_arm_vcpu_finalized(vcpu))
> > +		return -EPERM; /* too late! */
> > +
> > +	if (WARN_ON(vcpu->arch.sve_state))
> > +		return -EINVAL;
> > +
> > +	BUILD_BUG_ON(sizeof(vqs) != KVM_REG_SIZE(reg->id));
>
> Same as above.
>
> > +	if (copy_from_user(vqs, (const void __user *)reg->addr, sizeof(vqs)))
> > +		return -EFAULT;
> > +
> > +	max_vq = 0;
> > +	for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; ++vq)
> > +		if (vq_present(&vqs, vq))
> > +			max_vq = vq;
> > +
> > +	/*
> > +	 * The get_sve_reg()/set_sve_reg() ioctl interface will need
> > +	 * to be extended with multiple register slice support in
> > +	 * order to support vector lengths greater than
> > +	 * SVE_VL_ARCH_MAX:
> > +	 */
> > +	if (WARN_ONCE(sve_vl_from_vq(max_vq) > SVE_VL_ARCH_MAX,
> > +		      "KVM: Requested vector length not supported yet\n"))
> > +		return -EINVAL;
> > +
> > +	for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> > +		if (vq_present(&vqs, vq) != sve_vq_available(vq))
> > +			return -EINVAL;
> > +
> > +	/* Can't run with no vector lengths at all: */
> > +	if (max_vq < SVE_VQ_MIN)
> > +		return -EINVAL;
> > +
> > +	vcpu->arch.sve_max_vl = sve_vl_from_vq(max_vq);
> > +
> > +	return 0;
> > +}
> > +
> >  #define SVE_REG_SLICE_SHIFT	0
> >  #define SVE_REG_SLICE_BITS	5
> >  #define SVE_REG_ID_SHIFT	(SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> > @@ -297,9 +372,21 @@ static int sve_reg_region(struct sve_state_region *region,
> >  static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> >  {
> >  	struct sve_state_region region;
> > +	int ret;
> >  	char __user *uptr = (char __user *)reg->addr;
> >
> > -	if (!vcpu_has_sve(vcpu) || sve_reg_region(&region, vcpu, reg))
> > +	if (!vcpu_has_sve(vcpu))
> > +		return -ENOENT;
> > +
> > +	if (reg->id == KVM_REG_ARM64_SVE_VLS)
> > +		return get_sve_vls(vcpu, reg);
> > +
> > +	/* Finalize the number of slices per SVE register: */
> > +	ret = kvm_arm_vcpu_finalize(vcpu);
>
> Having this here feels a bit random...

I'll have another think about this.

KVM_REG_ARM64_SVE_VLS certainly has nothing to do with anything except
SVE, and putting it in the SVE coproc space allows us to demux
SVE-related stuff from everything else neatly.

The whole "coprocessor" concept is nonsense for arm64 anyway except for
the sysregs (where the reg ID encodings map directly onto the
architecture and also align with AArch32). So, the copro number is
pretty meaningless except as a demux key for related groups of
registers, which is how I use it here.

There's also no well-defined distinction between real and fake
registers, and no natural home for miscellaneous fakes -- all existing
blocks are for specific purposes such as recording the state of
emulated firmware APIs etc.

The SVE registers as exposed here already have no direct equivalent in
the architecture either, due to the padding up to 2048 bits, and the
presentation of FFR as an addressable P-register etc.

> I'd suggest considering the pseudo-register outside of the SVE co-proc,
> as part of a set of registers that do not finalize a vcpu when accessed.

(Also, the dependency is not one-way: _SVE_VLS is not independent of
finalization: quite the reverse -- it can only be written _before_
finalization.)

> All other registers (even non-sve ones) would finalize the vcpu when
> accessed from userland.
So, we could consider registers to be in two classes, as you suggest:

 * early registers
 * regular registers (i.e., everything else)

where early registers can only be written until the first access to a
regular register, or the first KVM_GET_REG_LIST or KVM_RUN.

This is a stricter rule than I have today: i.e., _SVE_VLS would really
need to be written before touching anything else. But that may be a
good thing for avoiding future ABI drift.

This may be premature factoring though. It's not clear whether many new
registers like KVM_REG_ARM64_SVE_VLS will exist, or whether they would
have compatible sequencing requirements.

Thoughts?

Independently of this, I'll have a look at whether there's a
straightforward way to simplify the code flow.

Cheers
---Dave
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm