On Mon, Jan 14, 2013 at 10:05 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> On 14/01/13 14:58, Christoffer Dall wrote:
>> On Mon, Jan 14, 2013 at 4:56 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>> On 12/01/13 17:48, Christoffer Dall wrote:
>>>> On Sat, Jan 12, 2013 at 6:19 AM, Marc Zyngier <maz@xxxxxxxxxxxxxxx> wrote:
>>>>> On Sat, 12 Jan 2013 01:20:39 -0500, Christoffer Dall
>>>>> <c.dall@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>> So, eh..., we seem to have forgotten to enable the data cache in Hyp
>>>>>> mode. This makes things more faster.
>>>>>
>>>>> Faster should be enough. More faster feels like you're exceeding some
>>>>> speed limit...
>>>>>
>>>>>>
>>>>>> Signed-off-by: Christoffer Dall <c.dall@xxxxxxxxxxxxxxxxxxxxxx>
>>>>>> ---
>>>>>>  arch/arm/kvm/init.S | 4 ++--
>>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S
>>>>>> index f179f10..67ec26c 100644
>>>>>> --- a/arch/arm/kvm/init.S
>>>>>> +++ b/arch/arm/kvm/init.S
>>>>>> @@ -90,8 +90,8 @@ __do_hyp_init:
>>>>>>      mrc    p15, 0, r1, c1, c0, 0    @ SCTLR
>>>>>>      ldr    r12, =(HSCTLR_EE | HSCTLR_FI)
>>>>>>      and    r1, r1, r12
>>>>>> - ARM(    ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I)    )
>>>>>> - THUMB(  ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I | HSCTLR_TE)    )
>>>>>> + ARM(    ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I | HSCTLR_C )    )
>>>>>> + THUMB(  ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I | HSCTLR_C | HSCTLR_TE)    )
>>>>>>      orr    r1, r1, r12
>>>>>>      orr    r0, r0, r1
>>>>>>      isb
>>>>>
>>>>> Nice catch.
>>>>>
>>>>> Though you may want to remove the C and I bits from HSCTLR_MASK instead,
>>>>> so we can honour the CPU_ICACHE_DISABLE and CPU_DCACHE_DISABLE options even
>>>>> in HYP mode.
>>>>>
>>>> Just to be sure, you mean this right?
>>>>
>>>> diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S
>>>> index 67ec26c..9f37a79 100644
>>>> --- a/arch/arm/kvm/init.S
>>>> +++ b/arch/arm/kvm/init.S
>>>> @@ -88,10 +88,10 @@ __do_hyp_init:
>>>>      ldr    r12, =HSCTLR_MASK
>>>>      bic    r0, r0, r12
>>>>      mrc    p15, 0, r1, c1, c0, 0    @ SCTLR
>>>> -    ldr    r12, =(HSCTLR_EE | HSCTLR_FI)
>>>> +    ldr    r12, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
>>>>      and    r1, r1, r12
>>>> - ARM(    ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I | HSCTLR_C )    )
>>>> - THUMB(  ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_I | HSCTLR_C | HSCTLR_TE)    )
>>>> + ARM(    ldr    r12, =(HSCTLR_M | HSCTLR_A)    )
>>>> + THUMB(  ldr    r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE)    )
>>>>      orr    r1, r1, r12
>>>>      orr    r0, r0, r1
>>>>      isb
>>>
>>> Yes, that's what I meant. I was trying to look at whether or not we
>>> could completely get rid of HSCTLR_MASK (we can't), and ended up mixing
>>> up the two things...
>>>
>> ok, I'll push this change then.
>
> Had a quick (and very non-scientific) hackbench run on both host and
> guest this morning, and with this patch the guest is about 90% of the
> host speed.
>
as I said - more faster ;)

> It would be good to couple that with huge pages and see if that brings
> us even closer (it should).
>
Is this on TC2? I see a 16% slowdown of hackbench running on Arndale
using this patch and THP.

-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
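
For anyone following the bit arithmetic in the second diff, here is a rough C sketch of what the patched sequence computes. It is an illustration only, not kernel code: the function name hyp_sctlr() and its parameters are made up for this note, the bit positions are the architectural ARMv7-A SCTLR/HSCTLR ones, and the contents of HSCTLR_MASK are an assumption (the thread does not show its definition).

```c
#include <stdint.h>

/* Architectural bit positions shared by SCTLR and HSCTLR on ARMv7-A. */
#define HSCTLR_M   (1u << 0)   /* MMU enable */
#define HSCTLR_A   (1u << 1)   /* alignment check enable */
#define HSCTLR_C   (1u << 2)   /* data cache enable */
#define HSCTLR_I   (1u << 12)  /* instruction cache enable */
#define HSCTLR_WXN (1u << 19)  /* write implies execute-never */
#define HSCTLR_FI  (1u << 21)  /* fast interrupts */
#define HSCTLR_EE  (1u << 25)  /* exception endianness */
#define HSCTLR_TE  (1u << 30)  /* Thumb exceptions */

/* ASSUMPTION: the thread does not show HSCTLR_MASK; this is a guess
 * that covers every bit the init code touches. */
#define HSCTLR_MASK (HSCTLR_M | HSCTLR_A | HSCTLR_C | HSCTLR_I | \
                     HSCTLR_WXN | HSCTLR_FI | HSCTLR_EE | HSCTLR_TE)

/* Hypothetical helper mirroring the patched assembly:
 *   hsctlr = current HSCTLR value (r0)
 *   sctlr  = host SCTLR value (r1)
 *   thumb  = nonzero for a Thumb-2 kernel build */
static uint32_t hyp_sctlr(uint32_t hsctlr, uint32_t sctlr, int thumb)
{
    /* bic r0, r0, r12: clear every bit that Hyp init manages */
    uint32_t r0 = hsctlr & ~HSCTLR_MASK;

    /* and r1, r1, r12: inherit EE, FI and now also I and C from the
     * host, so a host built with the caches disabled keeps them
     * disabled in Hyp mode too */
    uint32_t r1 = sctlr & (HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C);

    /* orr r1, r1, r12: always enable the MMU and alignment checks,
     * plus Thumb exception entry on Thumb-2 kernels */
    r1 |= HSCTLR_M | HSCTLR_A;
    if (thumb)
        r1 |= HSCTLR_TE;

    /* orr r0, r0, r1 */
    return r0 | r1;
}
```

With the first version of the patch, C and I were ORed in unconditionally; in this version they are set in Hyp mode only when the host SCTLR already has them set, which is what allows CPU_ICACHE_DISABLE and CPU_DCACHE_DISABLE to be honoured.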