On Wed May 22, 2024 at 6:49 PM AEST, Gautam Menghani wrote: > Doorbell emulation is broken for KVM on PowerVM guests as support for > DPDES was not added in the initial patch series. Due to this, a KVM on > PowerVM guest cannot be booted with the XICS interrupt controller as > doorbells are to be setup in the initial probe path when using XICS > (pSeries_smp_probe()). Add DPDES support in the host KVM code to fix > doorbell emulation. This is broken when the KVM guest has SMT > 1? Or is it broken for SMT=1 as well? Can you explain a bit more of what breaks if it's the latter? > Fixes: 6ccbbc33f06a ("KVM: PPC: Add helper library for Guest State Buffers") > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Gautam Menghani <gautam@xxxxxxxxxxxxx> > --- > v1 -> v1 resend: > 1. Add the stable tag > > Documentation/arch/powerpc/kvm-nested.rst | 4 +++- > arch/powerpc/include/asm/guest-state-buffer.h | 3 ++- > arch/powerpc/include/asm/kvm_book3s.h | 1 + > arch/powerpc/kvm/book3s_hv.c | 14 +++++++++++++- > arch/powerpc/kvm/book3s_hv_nestedv2.c | 7 +++++++ > arch/powerpc/kvm/test-guest-state-buffer.c | 2 +- > 6 files changed, 27 insertions(+), 4 deletions(-) > > diff --git a/Documentation/arch/powerpc/kvm-nested.rst b/Documentation/arch/powerpc/kvm-nested.rst > index 630602a8aa00..5defd13cc6c1 100644 > --- a/Documentation/arch/powerpc/kvm-nested.rst > +++ b/Documentation/arch/powerpc/kvm-nested.rst > @@ -546,7 +546,9 @@ table information. > +--------+-------+----+--------+----------------------------------+ > | 0x1052 | 0x08 | RW | T | CTRL | > +--------+-------+----+--------+----------------------------------+ > -| 0x1053-| | | | Reserved | > +| 0x1053 | 0x08 | RW | T | DPDES | > ++--------+-------+----+--------+----------------------------------+ > +| 0x1054-| | | | Reserved | > | 0x1FFF | | | | | > +--------+-------+----+--------+----------------------------------+ > | 0x2000 | 0x04 | RW | T | CR | > diff --git a/arch/powerpc/include/asm/guest-state-buffer.h b/arch/powerpc/include/asm/guest-state-buffer.h > index 808149f31576..d107abe1468f 100644 > --- a/arch/powerpc/include/asm/guest-state-buffer.h > +++ b/arch/powerpc/include/asm/guest-state-buffer.h > @@ -81,6 +81,7 @@ > #define KVMPPC_GSID_HASHKEYR 0x1050 > #define KVMPPC_GSID_HASHPKEYR 0x1051 > #define KVMPPC_GSID_CTRL 0x1052 > +#define KVMPPC_GSID_DPDES 0x1053 > > #define KVMPPC_GSID_CR 0x2000 > #define KVMPPC_GSID_PIDR 0x2001 > @@ -110,7 +111,7 @@ > #define KVMPPC_GSE_META_COUNT (KVMPPC_GSE_META_END - KVMPPC_GSE_META_START + 1) > > #define KVMPPC_GSE_DW_REGS_START KVMPPC_GSID_GPR(0) > -#define KVMPPC_GSE_DW_REGS_END KVMPPC_GSID_CTRL > +#define KVMPPC_GSE_DW_REGS_END KVMPPC_GSID_DPDES > #define KVMPPC_GSE_DW_REGS_COUNT \ > (KVMPPC_GSE_DW_REGS_END - KVMPPC_GSE_DW_REGS_START + 1) > > diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h > index 3e1e2a698c9e..10618622d7ef 100644 > --- a/arch/powerpc/include/asm/kvm_book3s.h > +++ b/arch/powerpc/include/asm/kvm_book3s.h > @@ -594,6 +594,7 @@ static inline u##size kvmppc_get_##reg(struct kvm_vcpu *vcpu) \ > > > KVMPPC_BOOK3S_VCORE_ACCESSOR(vtb, 64, KVMPPC_GSID_VTB) > +KVMPPC_BOOK3S_VCORE_ACCESSOR(dpdes, 64, KVMPPC_GSID_DPDES) > KVMPPC_BOOK3S_VCORE_ACCESSOR_GET(arch_compat, 32, KVMPPC_GSID_LOGICAL_PVR) > KVMPPC_BOOK3S_VCORE_ACCESSOR_GET(lpcr, 64, KVMPPC_GSID_LPCR) > KVMPPC_BOOK3S_VCORE_ACCESSOR_SET(tb_offset, 64, KVMPPC_GSID_TB_OFFSET) > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 35cb014a0c51..cf285e5153ba 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -4116,6 +4116,11 @@ static int kvmhv_vcpu_entry_nestedv2(struct kvm_vcpu *vcpu, u64 time_limit, > int trap; > long rc; > > + if (vcpu->arch.doorbell_request) { > + vcpu->arch.doorbell_request = 0; > + kvmppc_set_dpdes(vcpu, 1); > + } This probably looks okay... hmm, is the v1 KVM emulating doorbells correctly for SMT L2 guests? I wonder if doorbell emulation isn't broken there too because the L1 code looks to be passing in vc->dpdes but all the POWER9 emulation code uses doorbell_request. > + > io = &vcpu->arch.nestedv2_io; > > msr = mfmsr(); > @@ -4278,9 +4283,16 @@ static int kvmhv_p9_guest_entry(struct kvm_vcpu *vcpu, u64 time_limit, > if (kvmhv_on_pseries()) { > if (kvmhv_is_nestedv1()) > trap = kvmhv_vcpu_entry_p9_nested(vcpu, time_limit, lpcr, tb); > - else > + else { > trap = kvmhv_vcpu_entry_nestedv2(vcpu, time_limit, lpcr, tb); > > + /* Remember doorbell if it is pending */ > + if (kvmppc_get_dpdes(vcpu)) { > + vcpu->arch.doorbell_request = 1; > + kvmppc_set_dpdes(vcpu, 0); > + } This is adding an extra get state for every entry, not good. I don't think it's actually needed though. I don't think the L1 cares at this stage what the L2 DPDES state is. So you sholud be able to drop this hunk. > + } > + > /* H_CEDE has to be handled now, not later */ > if (trap == BOOK3S_INTERRUPT_SYSCALL && !nested && > kvmppc_get_gpr(vcpu, 3) == H_CEDE) { > diff --git a/arch/powerpc/kvm/book3s_hv_nestedv2.c b/arch/powerpc/kvm/book3s_hv_nestedv2.c > index 8e6f5355f08b..36863fff2a99 100644 > --- a/arch/powerpc/kvm/book3s_hv_nestedv2.c > +++ b/arch/powerpc/kvm/book3s_hv_nestedv2.c > @@ -311,6 +311,10 @@ static int gs_msg_ops_vcpu_fill_info(struct kvmppc_gs_buff *gsb, > rc = kvmppc_gse_put_u64(gsb, iden, > vcpu->arch.vcore->vtb); > break; > + case KVMPPC_GSID_DPDES: > + rc = kvmppc_gse_put_u64(gsb, iden, > + vcpu->arch.vcore->dpdes); > + break; > case KVMPPC_GSID_LPCR: > rc = kvmppc_gse_put_u64(gsb, iden, > vcpu->arch.vcore->lpcr); > @@ -543,6 +547,9 @@ static int gs_msg_ops_vcpu_refresh_info(struct kvmppc_gs_msg *gsm, > case KVMPPC_GSID_VTB: > vcpu->arch.vcore->vtb = kvmppc_gse_get_u64(gse); > break; > + case KVMPPC_GSID_DPDES: > + vcpu->arch.vcore->dpdes = kvmppc_gse_get_u64(gse); > + break; > case KVMPPC_GSID_LPCR: > vcpu->arch.vcore->lpcr = kvmppc_gse_get_u64(gse); > break; I would split all the wiring up of the DPDES GSID stuff into its own patch, it obviously looks fine. > diff --git a/arch/powerpc/kvm/test-guest-state-buffer.c b/arch/powerpc/kvm/test-guest-state-buffer.c > index 4720b8dc8837..91ae660cfe21 100644 > --- a/arch/powerpc/kvm/test-guest-state-buffer.c > +++ b/arch/powerpc/kvm/test-guest-state-buffer.c > @@ -151,7 +151,7 @@ static void test_gs_bitmap(struct kunit *test) > i++; > } > > - for (u16 iden = KVMPPC_GSID_GPR(0); iden <= KVMPPC_GSID_CTRL; iden++) { > + for (u16 iden = KVMPPC_GSID_GPR(0); iden <= KVMPPC_GSID_DPDES; iden++) { > kvmppc_gsbm_set(&gsbm, iden); > kvmppc_gsbm_set(&gsbm1, iden); > KUNIT_EXPECT_TRUE(test, kvmppc_gsbm_test(&gsbm, iden)); It would be good to have a _LAST define for such loops. It's very easy to miss when adding KVMPPC_GSID_DPDES that you need to grep for KVMPPC_GSID_CTRL. Very easy to see that you need to update _LAST. You just need to work out a good name for it since there's a few "namespaces" of numbers with similar prefix. Good luck :) Thanks, Nick