On Wed, Feb 13, 2013 at 11:21 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> On 13/02/13 16:07, Christoffer Dall wrote:
>> On Wed, Feb 13, 2013 at 10:46 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>> At the moment, KVM/ARM is quite heavy-handed when it comes to i-cache
>>> invalidation, as it is flushed on each stage-2 mapping.
>>>
>>> An alternative is to mark each page as non-executable (courtesy of the
>>> XN flag), and then to invalidate the i-cache when the CPU tries to
>>> execute a page.
>>>
>>> We're basically trading off invalidation for faults. Performance-wise,
>>> the difference is very modest (I've seen a 0.2% improvement over 10
>>> runs of "hackbench 100 process 1000"). But the important thing in my
>>> opinion is that it reduces the impact of the VM on the whole system
>>> (fault handling only impacts the VM, while invalidation is global).
>>>
>>> Code-wise, this introduces a bit of restructuring in our stage-2
>>> manipulation code, making the code a bit cleaner (IMHO). Note that
>>> these patches are against my arm64 branch and won't apply on anything
>>> else.
>>>
>>> As always, comments welcome.
>>>
>>
>> Hey Marc,
>>
>> I'll give this a once-over. From my initial glance there are some
>> issues with the stage2_get_pte stuff if you think about section
>> mappings later on, and while I kind of see the background, I'm not
>> sure it's as bad as it's made out to be.
>
> We certainly need to cater for section mappings, and that would
> probably deserve a stage2_{get_pmd,set_pmd_at}.
>
> My problem with the single stage2_set_pte function is that it makes it
> very hard to perform a read-modify-write operation on an existing PTE.
> We don't really need it now, as we only ever change one single bit
> (RW). But as soon as you need a second bit (XN, for example), you're
> screwed. The pathological example being a page that has both code and
> data that needs to be written.
>

It's definitely on the heavy side. I'll give it a more careful review.

>> In any case, we really need to measure the impact of this on both a
>> wider range of workloads inside the VM and on the host performance.
>
> Agreed. I plan to do that at some point, once I get all my pending
> patches properly aligned (a bit busy with BE guests at the moment).
>

BE as in Big Endian? How come that's a priority? (Just curious.)

As for the measurements, let's sync up; I may be doing something
similar. I use this horrible set of scripts to measure performance
stuff, which may be useful to you too:

https://github.com/chazy/kvmperf

-Christoffer
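
A minimal, self-contained C sketch of the trade-off Marc describes
above: pages are mapped execute-never up front, and the i-cache is
only invalidated when the guest actually fetches from a page. All the
names here (stage2_pte_t, PTE_XN, icache_inval_page, and so on) are
hypothetical stand-ins for illustration, not the real KVM/ARM helpers,
and the bit positions are likewise illustrative.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t stage2_pte_t;

#define PTE_VALID  (1ULL << 0)
#define PTE_XN     (1ULL << 54)  /* execute-never; position illustrative */

/* Stand-in for the real i-cache maintenance primitive. */
static void icache_inval_page(uint64_t ipa)
{
        printf("i-cache invalidated for page at 0x%llx\n",
               (unsigned long long)ipa);
}

/* Install a stage-2 mapping with XN set: no i-cache work needed yet. */
static stage2_pte_t stage2_mk_pte(uint64_t pa)
{
        return (pa & ~0xfffULL) | PTE_VALID | PTE_XN;
}

/* Exec-fault handler: pay for the invalidation on the first fetch,
 * then clear XN so later fetches do not fault at all. */
static void handle_exec_fault(stage2_pte_t *pte, uint64_t ipa)
{
        if (*pte & PTE_XN) {
                icache_inval_page(ipa);
                *pte &= ~PTE_XN;
                /* real code would also need TLB maintenance here */
        }
}

int main(void)
{
        stage2_pte_t pte = stage2_mk_pte(0x80000000ULL);

        handle_exec_fault(&pte, 0x80000000ULL); /* first fetch: faults   */
        handle_exec_fault(&pte, 0x80000000ULL); /* second fetch: no work */
        return 0;
}

Even in this toy the point is visible: the invalidation cost is paid
per faulting page, by the faulting VM, instead of globally on every
stage-2 mapping.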
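
And a similarly hypothetical sketch of the read-modify-write point: if
each fault handler can look up the live entry (a stage2_get_pte-style
primitive, made up here as a toy single-level walk) and flip only its
own bit, Marc's pathological page that is both written and executed
just works, whichever fault arrives first.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t stage2_pte_t;

#define PTE_RDONLY (1ULL << 7)
#define PTE_XN     (1ULL << 54)

/* Toy single-level "walk"; the real code descends the table levels. */
static stage2_pte_t *stage2_get_pte(stage2_pte_t *table, uint64_t ipa)
{
        return &table[(ipa >> 12) & 0x1ff];
}

/* Each handler does a read-modify-write on the existing entry,
 * preserving whatever the other handler has already changed. */
static void handle_write_fault(stage2_pte_t *table, uint64_t ipa)
{
        *stage2_get_pte(table, ipa) &= ~PTE_RDONLY;
}

static void handle_exec_fault(stage2_pte_t *table, uint64_t ipa)
{
        *stage2_get_pte(table, ipa) &= ~PTE_XN;
}

int main(void)
{
        stage2_pte_t table[512] = { 0 };
        uint64_t ipa = 0x1000;

        table[1] = 1 | PTE_RDONLY | PTE_XN;  /* page with code and data */

        handle_write_fault(table, ipa);      /* guest writes its data  */
        handle_exec_fault(table, ipa);       /* then executes the code */

        printf("final pte: 0x%llx\n", (unsigned long long)table[1]);
        return 0;
}

With only a set_pte that constructs a fresh entry from scratch, each
handler would have to know every other bit's current state to avoid
undoing the other's work; exposing the existing PTE keeps each update
local to its own fault.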