Hi Marc, On 7/19/21 5:56 PM, Marc Zyngier wrote: > Hi Alex, > > On 2021-07-19 17:35, Alexandru Elisei wrote: >> Hi Marc, >> >> On 7/19/21 1:39 PM, Marc Zyngier wrote: >>> We keep an entry for the PMSWINC_EL0 register in the vcpu structure, >>> while *never* writing anything there outside of reset. >>> >>> Given that the register is defined as write-only, that we always >>> trap when this register is accessed, there is little point in saving >>> anything anyway. >>> >>> Get rid of the entry, and save a mighty 8 bytes per vcpu structure. >>> >>> We still need to keep it exposed to userspace in order to preserve >>> backward compatibility with previously saved VMs. Since userspace >>> cannot expect any effect of writing to PMSWINC_EL0, treat the >>> register as RAZ/WI for the purpose of userspace access. >>> >>> Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> >>> --- >>> arch/arm64/include/asm/kvm_host.h | 1 - >>> arch/arm64/kvm/sys_regs.c | 21 ++++++++++++++++++++- >>> 2 files changed, 20 insertions(+), 2 deletions(-) >>> >>> diff --git a/arch/arm64/include/asm/kvm_host.h >>> b/arch/arm64/include/asm/kvm_host.h >>> index 41911585ae0c..afc169630884 100644 >>> --- a/arch/arm64/include/asm/kvm_host.h >>> +++ b/arch/arm64/include/asm/kvm_host.h >>> @@ -185,7 +185,6 @@ enum vcpu_sysreg { >>> PMCNTENSET_EL0, /* Count Enable Set Register */ >>> PMINTENSET_EL1, /* Interrupt Enable Set Register */ >>> PMOVSSET_EL0, /* Overflow Flag Status Set Register */ >>> - PMSWINC_EL0, /* Software Increment Register */ >>> PMUSERENR_EL0, /* User Enable Register */ >>> >>> /* Pointer Authentication Registers in a strict increasing order. */ >>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c >>> index f22139658e48..a1f5101f49a3 100644 >>> --- a/arch/arm64/kvm/sys_regs.c >>> +++ b/arch/arm64/kvm/sys_regs.c >>> @@ -1286,6 +1286,20 @@ static int set_raz_id_reg(struct kvm_vcpu *vcpu, const >>> struct sys_reg_desc *rd, >>> return __set_id_reg(vcpu, rd, uaddr, true); >>> } >>> >>> +static int set_wi_reg(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd, >>> + const struct kvm_one_reg *reg, void __user *uaddr) >>> +{ >>> + int err; >>> + u64 val; >>> + >>> + /* Perform the access even if we are going to ignore the value */ >>> + err = reg_from_user(&val, uaddr, sys_reg_to_index(rd)); >> >> I don't understand why the read still happens if the value is ignored. >> Just so KVM >> preserves the previous behaviour and tells userspace there was an error? > > If userspace has given us a duff pointer, it needs to know about it. Makes sense, thanks. > >>> + if (err) >>> + return err; >>> + >>> + return 0; >>> +} >>> + >>> static bool access_ctr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, >>> const struct sys_reg_desc *r) >>> { >>> @@ -1629,8 +1643,13 @@ static const struct sys_reg_desc sys_reg_descs[] = { >>> .access = access_pmcnten, .reg = PMCNTENSET_EL0 }, >>> { PMU_SYS_REG(SYS_PMOVSCLR_EL0), >>> .access = access_pmovs, .reg = PMOVSSET_EL0 }, >>> + /* >>> + * PM_SWINC_EL0 is exposed to userspace as RAZ/WI, as it was >>> + * previously (and pointlessly) advertised in the past... >>> + */ >>> { PMU_SYS_REG(SYS_PMSWINC_EL0), >>> - .access = access_pmswinc, .reg = PMSWINC_EL0 }, >>> + .get_user = get_raz_id_reg, .set_user = set_wi_reg, >> >> In my opinion, the call chain to return 0 looks pretty confusing to me, as the >> functions seemed made for ID register accesses, and the leaf function, >> read_id_reg(), tries to match this register with a list of ID >> registers. Since we >> have already added a new function just for PMSWINC_EL0, I was >> wondering if adding >> another function, something like get_raz_reg(), would make more sense. > > In that case, I'd rather just kill get_raz_id_reg() and replace it with > this get_raz_reg(). If we trat something as RAZ, who cares whether it is > an idreg or not? I agree, the Arm ARM doesn't make the distinction between ID registers and other system registers in the definition of RAZ, I don't think KVM should either. And the way read_id_reg() is written allows returning a value different than 0 even if raz is true, which in my opinion could only happen because of a bug in KVM. I can have a go at writing the patch(es) on top of this series, if you want. At the moment I'm rewriting the KVM SPE series, so it will be a few weeks until I get around to doing it though. Thanks, Alex