On 15.06.2011, at 22:58, Scott Wood wrote:

> On Wed, 15 Jun 2011 13:34:06 +0200
> Alexander Graf <agraf@xxxxxxx> wrote:
>
>>
>> On 15.06.2011, at 12:50, Alexander Graf wrote:
>>> What are your results when using the magic page? I have the following numbers with your patches applied:
>>>
>>> == bare metal ==
>>>
>>> root@e500:~/kvm# time for i in {1..1000}; do /bin/echo > /dev/null; done
>>>
>>> real 0m5.445s
>>> user 0m0.204s
>>> sys 0m0.572s
>>>
>>>
>>> == no hypervisor node (magic page not used) ==
>>>
>>> debian-powerpc:~# time for i in {1..1000}; do /bin/echo > /dev/null; done
>>>
>>> real 1m36.362s
>>> user 0m13.224s
>>> sys 1m11.084s
>>>
>>>
>>> == with hypervisor node (magic page used) ==
>>>
>>> debian-powerpc:~# time for i in {1..1000}; do /bin/echo > /dev/null; done
>>>
>>> real 2m28.888s
>>> user 0m9.248s
>>> sys 1m4.016s
>>
>> Interesting - now I'm down to:
>>
>> debian-powerpc:~# time for i in {1..1000}; do /bin/echo > /dev/null; done
>>
>> real 1m25.008s
>> user 0m9.224s
>> sys 1m5.720s
>>
>>
>> Oh well, let's hope I did something wrong before :).
>
> Remember, I have more paravirt patches coming after this (wanted to get the
> MMU stuff dealt with first), and the kernel is still using 4K TLB1 pages in
> the default qemu config. We should probably use TLB0 when large pages
> aren't available.
>
> Without paravirt, no large pages:
>
> sh-2.05b# time for i in $(seq 1000); do /bin/echo > /dev/null ; done
>
> real 0m42.769s
> user 0m3.256s
> sys 0m34.988s
>
> With paravirt including my local patches (but still no large pages):

Do these include patches to move the MAS registers to the shared page? That should reduce the instruction traps by a significant number.

>
> sh-2.05b# time for i in $(seq 1000); do /bin/echo > /dev/null ; done
>
> real 0m40.339s
> user 0m1.560s
> sys 0m32.652s
>
> With large pages and no paravirt:
>
> sh-2.05b# time for i in $(seq 1000); do /bin/echo > /dev/null ;done
>
> real 0m7.986s

Wow, so this is where all the time gets wasted.
Sounds like the guest's kernel eats up all of it. I assume "large pages" means direct map?

> user 0m2.528s
> sys 0m3.232s
>
> With large pages and paravirt, but just this patchset (no further paravirt
> patches):
>
> sh-2.05b# time for i in $(seq 1000); do /bin/echo > /dev/null ; done
>
> real 0m6.067s
> user 0m3.068s
> sys 0m2.332s
>
> With large pages and all my paravirt patches:

Mind giving me a list of the patches that you have in the queue? Nothing fancy, just the instructions that you're already looking at.

> sh-2.05b# time for i in $(seq 1000); do /bin/echo > /dev/null ;done
>
> real 0m3.837s
> user 0m0.604s
> sys 0m0.316s
>
> On the host (different rfs, but I think similar in relevant ways, except
> that the host rfs has SPE and guest rfs is soft-float):
>
> # time for i in $(seq 1000); do /bin/echo > /dev/null ; done
>
> real 0m1.850s
> user 0m0.028s
> sys 0m0.236s
>
> I used seq because my rfs is using an older bash that doesn't seem to
> understand the range expression.

Sure, it's very valuable benchmarking data nevertheless! I just use the bash range thing because it's easier to type - and slightly faster.

Thanks a lot for these numbers.


Alex
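
As a side note on the seq versus {1..1000} point above: the following is a minimal sketch of a loop helper that needs neither, which might make it easier to run the same loop on both root filesystems. The name run_n is hypothetical and not something used in this thread; it assumes only a shell with $((...)) arithmetic.

```sh
#!/bin/sh
# Hypothetical helper (not from the thread): run a command N times
# using a plain counter, so neither brace ranges nor seq are required.
run_n() {
    n=$1          # number of iterations
    shift         # the remaining arguments are the command to run
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@"      # spawn the command once per iteration
        i=$((i + 1))
    done
}
```

In bash, where time is a shell keyword, something like "time run_n 1000 /bin/echo > /dev/null" should time the whole loop. The loop overhead differs slightly from the for-loop variants, so the results would probably not be directly comparable with the numbers quoted above.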