On Thu, Oct 11, 2012 at 8:54 AM, Alexander Graf <agraf@xxxxxxx> wrote:
>
> On 11.10.2012, at 12:37, Peter Maydell wrote:
>
>> On 11 October 2012 11:27, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>> On 11/10/12 10:55, Alexander Graf wrote:
>>>> On 11.10.2012, at 11:46, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>>>> - Our memory bandwidth is reduced by the number of TLB entries we waste
>>>>> by not using section mappings instead of 4kB pages. Running hackbench on
>>>>> a guest shows quite a slowdown that should mostly go away if/when
>>>>> userspace switches to huge pages as backing store. I expect virtio to
>>>>> suffer from the same problem.
>>>>
>>>> That one should be in a completely different ballpark. I'd be very
>>>> surprised if you get more than 10% slowdowns in TLB miss intensive
>>>> workloads. Definitely not as low of a hanging fruit as we see here.
>>>
>>> The A15 uses separate TLBs for stage-1 and stage-2 translation, which
>>> means that each random memory access costs us two TLBs, effectively
>>> reducing the efficiency of the translation hardware by 50%.
>>>
>>> Using 2MB sections as the backing store (stage-2) should definitely be
>>> quite an improvement.
>>
>> Having done a quick google and looked at the QEMU code, I think that
>> hugepage support should be a purely kernel-side thing: if you add
>> transparent hugepage support then QEMU will just DTRT without special
>> config.
>>
>> (Alex: does that sound right?)
>
> Don't bother with transparent hugepage support for now. Just use
> -mem-path and a directly mounted hugetlbfs with a page pool. But yes,
> no changes needed in QEMU at all. All the infrastructure for that is
> there already.
>
> If you need code to look at on how this works, check out
>
> http://lxr.free-electrons.com/source/arch/powerpc/kvm/e500_tlb.c?a=powerpc#L498
>
>

I'm working on this already.
-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm
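
A rough sketch of the arithmetic behind the section-mapping argument in the thread above: it counts how many stage-2 mappings are needed to cover a guest's RAM with 4kB pages versus 2MB sections. The 512 MB guest size and the helper name mappings_needed are assumptions made for illustration, not figures or code from the thread, and the calculation says nothing about real TLB hit rates or the measured hackbench slowdown.

```python
# Illustrative arithmetic only. The guest RAM size is a hypothetical value.
PAGE_4K = 4 * 1024             # 4kB page
SECTION_2M = 2 * 1024 * 1024   # 2MB section


def mappings_needed(ram_bytes, mapping_bytes):
    """Return how many mappings of mapping_bytes cover ram_bytes (rounded up)."""
    return -(-ram_bytes // mapping_bytes)


guest_ram = 512 * 1024 * 1024  # assumed guest RAM: 512 MB

pages = mappings_needed(guest_ram, PAGE_4K)
sections = mappings_needed(guest_ram, SECTION_2M)
pages_per_section = SECTION_2M // PAGE_4K

print("4kB mappings needed:", pages)
print("2MB mappings needed:", sections)
print("4kB pages covered by one 2MB section:", pages_per_section)
```

Each 2MB section covers the same range as 512 4kB pages, so a single TLB entry for a section stands in for up to 512 page entries in the stage-2 tables.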