On Tue, 20 Jul 2021 17:44:32 +0100, Alexandru Elisei <alexandru.elisei@xxxxxxx> wrote: > > Hi Marc, > > On 7/19/21 5:56 PM, Marc Zyngier wrote: > > Hi Alex, > > > > On 2021-07-19 17:35, Alexandru Elisei wrote: > >> Hi Marc, > >> > >> On 7/19/21 1:39 PM, Marc Zyngier wrote: > >>> We keep an entry for the PMSWINC_EL0 register in the vcpu structure, > >>> while *never* writing anything there outside of reset. > >>> > >>> Given that the register is defined as write-only, that we always > >>> trap when this register is accessed, there is little point in saving > >>> anything anyway. > >>> > >>> Get rid of the entry, and save a mighty 8 bytes per vcpu structure. > >>> > >>> We still need to keep it exposed to userspace in order to preserve > >>> backward compatibility with previously saved VMs. Since userspace > >>> cannot expect any effect of writing to PMSWINC_EL0, treat the > >>> register as RAZ/WI for the purpose of userspace access. > >>> > >>> Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> > >>> --- > >>> arch/arm64/include/asm/kvm_host.h | 1 - > >>> arch/arm64/kvm/sys_regs.c | 21 ++++++++++++++++++++- > >>> 2 files changed, 20 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/arch/arm64/include/asm/kvm_host.h > >>> b/arch/arm64/include/asm/kvm_host.h > >>> index 41911585ae0c..afc169630884 100644 > >>> --- a/arch/arm64/include/asm/kvm_host.h > >>> +++ b/arch/arm64/include/asm/kvm_host.h > >>> @@ -185,7 +185,6 @@ enum vcpu_sysreg { > >>> PMCNTENSET_EL0, /* Count Enable Set Register */ > >>> PMINTENSET_EL1, /* Interrupt Enable Set Register */ > >>> PMOVSSET_EL0, /* Overflow Flag Status Set Register */ > >>> - PMSWINC_EL0, /* Software Increment Register */ > >>> PMUSERENR_EL0, /* User Enable Register */ > >>> > >>> /* Pointer Authentication Registers in a strict increasing order. */ > >>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c > >>> index f22139658e48..a1f5101f49a3 100644 > >>> --- a/arch/arm64/kvm/sys_regs.c > >>> +++ b/arch/arm64/kvm/sys_regs.c > >>> @@ -1286,6 +1286,20 @@ static int set_raz_id_reg(struct kvm_vcpu *vcpu, const > >>> struct sys_reg_desc *rd, > >>> return __set_id_reg(vcpu, rd, uaddr, true); > >>> } > >>> > >>> +static int set_wi_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, > >>> + const struct kvm_one_reg *reg, void __user *uaddr) > >>> +{ > >>> + int err; > >>> + u64 val; > >>> + > >>> + /* Perform the access even if we are going to ignore the value */ > >>> + err = reg_from_user(&val, uaddr, sys_reg_to_index(rd)); > >> > >> I don't understand why the read still happens if the value is ignored. > >> Just so KVM > >> preserves the previous behaviour and tells userspace there was an error? > > > > If userspace has given us a duff pointer, it needs to know about it. > > Makes sense, thanks. > > > > >>> + if (err) > >>> + return err; > >>> + > >>> + return 0; > >>> +} > >>> + > >>> static bool access_ctr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, > >>> const struct sys_reg_desc *r) > >>> { > >>> @@ -1629,8 +1643,13 @@ static const struct sys_reg_desc sys_reg_descs[] = { > >>> .access = access_pmcnten, .reg = PMCNTENSET_EL0 }, > >>> { PMU_SYS_REG(SYS_PMOVSCLR_EL0), > >>> .access = access_pmovs, .reg = PMOVSSET_EL0 }, > >>> + /* > >>> + * PM_SWINC_EL0 is exposed to userspace as RAZ/WI, as it was > >>> + * previously (and pointlessly) advertised in the past... > >>> + */ > >>> { PMU_SYS_REG(SYS_PMSWINC_EL0), > >>> - .access = access_pmswinc, .reg = PMSWINC_EL0 }, > >>> + .get_user = get_raz_id_reg, .set_user = set_wi_reg, > >> > >> In my opinion, the call chain to return 0 looks pretty confusing to me, as the > >> functions seemed made for ID register accesses, and the leaf function, > >> read_id_reg(), tries to match this register with a list of ID > >> registers. Since we > >> have already added a new function just for PMSWINC_EL0, I was > >> wondering if adding > >> another function, something like get_raz_reg(), would make more sense. > > > > In that case, I'd rather just kill get_raz_id_reg() and replace it with > > this get_raz_reg(). If we trat something as RAZ, who cares whether it is > > an idreg or not? > > I agree, the Arm ARM doesn't make the distinction between ID > registers and other system registers in the definition of RAZ, I > don't think KVM should either. And the way read_id_reg() is written > allows returning a value different than 0 even if raz is true, which > in my opinion could only happen because of a bug in KVM. > > I can have a go at writing the patch(es) on top of this series, if > you want. At the moment I'm rewriting the KVM SPE series, so it will > be a few weeks until I get around to doing it though. We can do that at any time, it is just a cleanup without any guest or userspace visible effect. If a get a spare hour, I'll have a look. Otherwise, this can wait a bit. Thanks, M. -- Without deviation from the norm, progress is not possible.