Nicholas Piggin <npiggin@xxxxxxxxx> writes: <snip> >> static void restore_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr) >> @@ -324,9 +340,10 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) >> mask = LPCR_DPFD | LPCR_ILE | LPCR_TC | LPCR_AIL | LPCR_LD | >> LPCR_LPES | LPCR_MER; >> lpcr = (vc->lpcr & ~mask) | (l2_hv.lpcr & mask); >> - sanitise_hv_regs(vcpu, &l2_hv); >> restore_hv_regs(vcpu, &l2_hv); >> >> + sanitise_vcpu_entry_state(vcpu, &l2_hv, &saved_l1_hv); > > So instead of doing this, can we just have one function that does > load_hv_regs_for_l2()? Yes. I would go even further and fold everything into a load_l2_state() that takes care of hv and non-hv. The top level here could easily be: save_l1_state(); load_l2_state(); do { kvmhv_run_single_vcpu(); } while(); save_l2_state(); restore_l1_state(); I'll send a v3 with the change you suggested and then perhaps a small refactoring on top of it. Let's see how it turns out. > >> + >> vcpu->arch.ret = RESUME_GUEST; >> vcpu->arch.trap = 0; >> do { >> @@ -338,6 +355,8 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu) >> r = kvmhv_run_single_vcpu(vcpu, hdec_exp, lpcr); >> } while (is_kvmppc_resume_guest(r)); >> >> + sanitise_vcpu_return_state(vcpu, &l2_hv); > > And this could be done in save_hv_return_state(). > > I think? > > Question about HFSCR. Is it possible for some interrupt cause bit > reaching the nested hypervisor for a bit that we thought we had > enabled but was secretly masked off? I.e., do we have to filter > HFSCR causes according to the facilities we secretly disabled? Yes, we're copying the Cause bits unmodified. Currently it makes no difference because L1 only checks for doorbells and everything else leads to injecting a program interrupt into L2. What I think is the correct thing to do is to only return into L1 with the Cause bits pertaining to the facilities it has disabled (if L1 state has a bit set but L2 state has not). For the facilities L0 has disabled in L1, we should handle them as if L1 had tried to use the facility and instead of returning from H_ENTER_NESTED into L1, do whatever we currently do under BOOK3S_INTERRUPT_H_FAC_UNAVAIL for non-nested guests. Which would eventually mean injecting a program interrupt into L1 because we're not L2s hypervisor - L1 is - so there is not much we'd want to do in L0 in terms of emulating the facility. Does that make sense? > > Thanks, > Nick