> > > + assign_clear_set_bit(tpt, CNTHCTL_EL1PCEN << 10, set, clr); > > > + assign_clear_set_bit(tpc, CNTHCTL_EL1PCTEN << 10, set, clr); > > > > Nit: IMHO the way the code specifies the 'set' and 'clr' arguments for > > the macro might be a bit confusing ('set' is for '_clr', and 'clr' is > > for '_set')? > > I don't disagree, but we end-up with bits of different polarity once > NV is fully in, see: > > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/tree/arch/arm64/kvm/arch_timer.c?h=kvm-arm64/nv-6.4-WIP#n861 > > > Perhaps changing the parameter names of assign_clear_set_bit() like > > below or flipping the condition (i.e. Specify !tpt or no_tpt instead > > of tpt) might be less confusing? > > > > #define assign_clear_set_bit(_pred, _bit, _t_val, _f_val) \ > > do { \ > > if (_pred) \ > > (_t_val) |= (_bit); \ > > else \ > > (_f_val) |= (_bit); \ > > } while (0) > > See the pointer above. We need a good way to specify bits that have > one polarity or another, and compute the result given the high-level > constraints that are provided by the emulation code. > > So far, I haven't been able to work out something "nice". Thank you for the pointer. Understood. I don't have a batter idea though :) > > [...] > > > > + { SYS_DESC(SYS_CNTPCT_EL0), access_arch_timer }, > > > + { SYS_DESC(SYS_CNTPCTSS_EL0), access_arch_timer }, > > > { SYS_DESC(SYS_CNTP_TVAL_EL0), access_arch_timer }, > > > { SYS_DESC(SYS_CNTP_CTL_EL0), access_arch_timer }, > > > { SYS_DESC(SYS_CNTP_CVAL_EL0), access_arch_timer }, > > > @@ -2525,6 +2533,7 @@ static const struct sys_reg_desc cp15_64_regs[] = { > > > { Op1( 0), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, TTBR0_EL1 }, > > > { CP15_PMU_SYS_REG(DIRECT, 0, 0, 9, 0), .access = access_pmu_evcntr }, > > > { Op1( 0), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI1R */ > > > + { SYS_DESC(SYS_AARCH32_CNTPCT), access_arch_timer }, > > > > Shouldn't KVM also emulate CNTPCTSS (Aarch32) when its trapping is > > enabled on the host with ECV_CNTPOFF ? > > Oh, well spotted. I'll queue something on top of the series to that > effect (I'd rather not respin it as it has been in -next for some > time, and the merge window is approaching). Understood. > > > > > > > > { Op1( 1), CRn( 0), CRm( 2), Op2( 0), access_vm_reg, NULL, TTBR1_EL1 }, > > > { Op1( 1), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_ASGI1R */ > > > { Op1( 2), CRn( 0), CRm(12), Op2( 0), access_gic_sgi }, /* ICC_SGI0R */ > > > -- > > > 2.34.1 > > > > > > > > > > Nit: In emulating reading physical counter/timer for direct_ptimer > > (poffset != 0 on VHE without ECV_CNTPOFF), it appears that > > kvm_arm_timer_read_sysreg() unnecessarily calls > > timer_{save,restore}_state(), and kvm_arm_timer_write_sysreg() > > unnecessarily calls timer_save_state(). Couldn't we skip those > > unnecessary calls ? (I didn't check all the following patches, but > > the current code would be more convenient in the following patches ?) > > Well, it depends how you look at it. We still perform "some" level of > emulation (such as offsetting CVAL), and it allows us to share some > code with the full emulation. Even if we skip calling timer_save_state() and/or timer_restore_state(), I would think we could still share some code with the full emulation. What I meant was something like below (a diff from the patch-5). Or Am I misunderstanding something? --- a/arch/arm64/kvm/arch_timer.c +++ b/arch/arm64/kvm/arch_timer.c @@ -1003,7 +1003,7 @@ u64 kvm_arm_timer_read_sysreg(struct kvm_vcpu *vcpu, get_timer_map(vcpu, &map); timer = vcpu_get_timer(vcpu, tmr); - if (timer == map.emul_ptimer) + if (timer == map.emul_ptimer || timer == map.direct_ptimer) return kvm_arm_timer_read(vcpu, timer, treg); preempt_disable(); @@ -1056,7 +1056,9 @@ void kvm_arm_timer_write_sysreg(struct kvm_vcpu *vcpu, timer_emulate(timer); } else { preempt_disable(); - timer_save_state(timer); + if (timer != map.direct_ptimer) + timer_save_state(timer); + kvm_arm_timer_write(vcpu, timer, treg, val); timer_restore_state(timer); preempt_enable(); > On top of that, we already fast-track CNTPCT_EL0, which is the main > user, and has a visible benefit with NV. If anything, I'd rather add a > similar fast-tracking for the read side of CNTP_CVAL_EL0 and > CNTP_CTL_EL0. We could then leave that code for 32bit only, which > nobody gives a toss about. > > What do you think? Ah, I didn't check that fast-track CNTPCT_EL0. Yes, that would be nicer. Thank you! Reiji