Excerpts from Fabiano Rosas's message of February 27, 2021 8:37 am:
> Nicholas Piggin <npiggin@xxxxxxxxx> writes:
>
> Hi, thanks for this. It helped me clarify a bunch of details that I
> haven't understood while reading the asm code.

Yes, me too :)

>> +/*
>> + * void kvmppc_p9_enter_guest(struct vcpu *vcpu);
>> + *
>> + * Enter the guest on an ISAv3.0 or later system where we have exactly
>> + * one vcpu per vcore, and both the host and guest are radix, and
>> + * threads are set to "independent mode".
>> + */
>> +.balign	IFETCH_ALIGN_BYTES
>> +_GLOBAL(kvmppc_p9_enter_guest)
>> +EXPORT_SYMBOL_GPL(kvmppc_p9_enter_guest)
>> +	mflr	r0
>> +	std	r0,PPC_LR_STKOFF(r1)
>> +	stdu	r1,-SFS(r1)
>> +
>> +	std	r1,HSTATE_HOST_R1(r13)
>> +	std	r3,STACK_SLOT_VCPU(r1)
>
> The vcpu was stored on the paca previously. I don't get the change,
> could you explain?

The stack is a nicer place to keep things. We only need one place to
save the stack, then most things can come from there. There's actually
more paca stuff we could move into local variables or onto the stack
too.

It was probably done like this because PR KVM probably can't access the
stack in real mode when running in an LPAR, and it came across to the
HV code that way. The new path doesn't require it.

>> +kvmppc_p9_exit_interrupt:
>> +	std	r1,HSTATE_SCRATCH1(r13)
>> +	std	r3,HSTATE_SCRATCH2(r13)
>> +	ld	r1,HSTATE_HOST_R1(r13)
>> +	ld	r3,STACK_SLOT_VCPU(r1)
>> +
>> +	std	r9,VCPU_CR(r3)
>> +
>> +1:
>> +	std	r11,VCPU_PC(r3)
>> +	std	r12,VCPU_MSR(r3)
>> +
>> +	reg = 14
>> +	.rept	18
>> +	std	reg,__VCPU_GPR(reg)(r3)
>> +	reg = reg + 1
>> +	.endr
>> +
>> +	/* r1, r3, r9-r13 are saved to vcpu by C code */
>
> If we just saved r1 and r3, why don't we put them in the vcpu right now?
> That would avoid having the C code knowing about scratch areas.

Putting it in C avoids having the asm code know about scratch areas.
>> @@ -4429,11 +4432,19 @@ static int kvmppc_vcpu_run_hv(struct kvm_vcpu *vcpu)
>>  	else
>>  		r = kvmppc_run_vcpu(vcpu);
>>
>> -	if (run->exit_reason == KVM_EXIT_PAPR_HCALL &&
>> -	    !(vcpu->arch.shregs.msr & MSR_PR)) {
>> -		trace_kvm_hcall_enter(vcpu);
>> -		r = kvmppc_pseries_do_hcall(vcpu);
>> -		trace_kvm_hcall_exit(vcpu, r);
>> +	if (run->exit_reason == KVM_EXIT_PAPR_HCALL) {
>> +		if (unlikely(vcpu->arch.shregs.msr & MSR_PR)) {
>> +			/*
>> +			 * Guest userspace executed sc 1, reflect it
>> +			 * back as a privileged program check interrupt.
>> +			 */
>> +			kvmppc_core_queue_program(vcpu, SRR1_PROGPRIV);
>> +			r = RESUME_GUEST;
>
> This is in conflict with this snippet in kvmppc_handle_exit_hv:
>
> 	case BOOK3S_INTERRUPT_SYSCALL:
> 	{
> 		/* hcall - punt to userspace */
> 		int i;
>
> 		/* hypercall with MSR_PR has already been handled in rmode,
> 		 * and never reaches here.
> 		 */
>
> That function already queues some 0x700s so maybe we could move this one
> in there as well.

I don't think it conflicts, but perhaps it should go in the patch which
removed the real mode handlers.

kvmppc_handle_exit_hv is used by both HV paths, so for now it's a bit
neater to try to get things into the same state, but we could do this
in a later patch perhaps.

>> + *
>> + * To work around this we put a canary value into the HDSISR before
>> + * returning to a guest and then check for this canary when we take a
>> + * HDSI. If we find the canary on a HDSI, we know the hardware didn't
>> + * update the HDSISR. In this case we return to the guest to retake the
>> + * HDSI, which should correctly update the HDSISR the second time
>> + * through HDSI entry.
>> + *
>> + * Just do this on all p9 processors for now.
>> + */
>> +	mtspr(SPRN_HDSISR, HDSISR_CANARY);
>> +
>> +	accumulate_time(vcpu, &vcpu->arch.guest_time);
>> +
>> +	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_GUEST_HV_FAST;
>> +	kvmppc_p9_enter_guest(vcpu);
>> +	// Radix host and guest means host never runs with guest MMU state
>> +	local_paca->kvm_hstate.in_guest = KVM_GUEST_MODE_NONE;
>> +
>> +	accumulate_time(vcpu, &vcpu->arch.rm_intr);
>> +
>> +	/* Get these from r11/12 and paca exsave */
>> +	vcpu->arch.shregs.srr0 = mfspr(SPRN_SRR0);
>> +	vcpu->arch.shregs.srr1 = mfspr(SPRN_SRR1);
>> +	vcpu->arch.shregs.dar = mfspr(SPRN_DAR);
>> +	vcpu->arch.shregs.dsisr = mfspr(SPRN_DSISR);
>> +
>> +	trap = local_paca->kvm_hstate.scratch0 & ~0x2;
>
> Couldn't we return the trap from kvmppc_p9_enter_guest? Seems like a
> nice pattern to have and avoids exposing the IVEC+0x2 magic which is
> hidden in asm already.

I use the 0x2 test in C in a later patch.

The IVEC+0x2 thing might just go away entirely for the new HV path,
though, but we'd still clear it in C on the principle of minimal asm
code.

>> +	radix_clear_slb();
>> +
>> +	__mtmsrd(msr, 0);
>
> Isn't this the same as mtmsr(msr) ?

Yes. For 64s you can use the __ variant though. We use it in other
places in this function with L=1, so it doesn't make sense to mix
variants.

Thanks,
Nick

>
>> +
>> +	accumulate_time(vcpu, &vcpu->arch.rm_exit);
>> +
>> +	end_timing(vcpu);
>> +
>> +	return trap;
>> +}
>> +EXPORT_SYMBOL_GPL(__kvmhv_vcpu_entry_p9);
>
> <snip>
>