On Tue, 6 Dec 2016 17:09:07 +1100 Paul Mackerras <paulus@xxxxxxxxxx> wrote: > On Thu, Dec 01, 2016 at 06:18:10PM +1100, Nicholas Piggin wrote: > > Change the calling convention to put the trap number together with > > CR in two halves of r12, which frees up HSTATE_SCRATCH2 in the HV > > handler, and r9 free. > > Cute idea! Some comments below... > > > The 64-bit PR handler entry translates the calling convention back > > to match the previous call convention (i.e., shared with 32-bit), for > > simplicity. > > > > Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx> > > --- > > arch/powerpc/include/asm/exception-64s.h | 28 +++++++++++++++------------- > > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 15 +++++++-------- > > arch/powerpc/kvm/book3s_segment.S | 27 ++++++++++++++++++++------- > > 3 files changed, 42 insertions(+), 28 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h > > index 9a3eee6..bc8fc45 100644 > > --- a/arch/powerpc/include/asm/exception-64s.h > > +++ b/arch/powerpc/include/asm/exception-64s.h > > @@ -233,7 +233,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) > > > > #endif > > > > -#define __KVM_HANDLER_PROLOG(area, n) \ > > +#define __KVM_HANDLER(area, h, n) \ > > BEGIN_FTR_SECTION_NESTED(947) \ > > ld r10,area+EX_CFAR(r13); \ > > std r10,HSTATE_CFAR(r13); \ > > @@ -243,30 +243,32 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) > > std r10,HSTATE_PPR(r13); \ > > END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948); \ > > ld r10,area+EX_R10(r13); \ > > - stw r9,HSTATE_SCRATCH1(r13); \ > > - ld r9,area+EX_R9(r13); \ > > std r12,HSTATE_SCRATCH0(r13); \ > > - > > -#define __KVM_HANDLER(area, h, n) \ > > - __KVM_HANDLER_PROLOG(area, n) \ > > - li r12,n; \ > > + li r12,(n); \ > > + sldi r12,r12,32; \ > > + or r12,r12,r9; \ > > Did you consider doing it the other way around, i.e. with r12 > containing (cr << 32) | trap? That would save 1 instruction in each > handler: When I tinkered with it I thought it came out slightly nicer this way, but your suggested versions seem to prove me wrong. I can change it if you'd like. > > + sldi r12,r9,32; \ > + ori r12,r12,(n); \ > > > + ld r9,area+EX_R9(r13); \ > > + std r9,HSTATE_SCRATCH1(r13); \ > > Why not put this std in kvmppc_interrupt[_hv] rather than in each > handler? Patch 3/3 uses r9 to load the ctr when CONFIG_RELOCATABLE is turned on, so this resulted in the smaller difference between the two cases. I agree it's not ideal when config relocatable is off. [snip] Thanks, Nick -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html