On Mon, Apr 12, 2010 at 06:28:44PM +1000, Nick Piggin wrote:
> If virtualization is the main worry (which it seems that it is
> seeing as your TLB misses cost like 6 times more cachelines),
> then complexity should be pushed into the hypervisor, not the
> core kernel.

It's not just about virtualization on host, or I could have done a
much smaller patch without bothering so much to make something as
universal as possible with cows and stuff.

Also about virtualization you forget that the CPU can establish 2M tlb
entries in guest only if the guest and the host shadow pagetables are
both pmd_huge. If one of the two pmds isn't huge, then the guest
virtual to host physical translation won't be the same for all 512 4k
pages (well it might be if you're extremely lucky, but I strongly
doubt the CPU bothers to check the host pfns are contiguous if the
guest pmd and the shadow pmd aren't both huge).

In other words we have to do something that is totally disconnected
from virtualization, in order to take advantage of it to the maximum
extent with virt ;). This lets us leverage the KVM design compared to
vmware and the other inferior virtualization designs.

We make gcc run 8% faster on a cheap single socket workstation without
virt, and we get an even bigger cumulative boost in virtualized gcc
without changing anything at all in KVM. If this isn't the obvious
best way to go, I don't know what it is! ;)

> And that involves auditing and rewriting anything that allocates
> and pins kernel memory. It's not only dentries.

All gup pins that aren't short lived have to use mmu notifier; no
piece of the kernel is allowed to keep movable pages pinned for more
than the time it takes to complete the DMA. This has to be fixed
anyway to provide all the other benefits with GRU and XPMEM, now that
the VM locks are switching to mutex (and as usual with KVM too).
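
To make the 2M TLB condition described in this mail concrete, here is a minimal toy sketch in C. It is not kernel or KVM code and it does not model real hardware: the macro and function names (SMALL_PAGE_SHIFT, HUGE_PMD_SHIFT, guest_gets_2m_tlb, tlb_entries_for_2m) are hypothetical, and the shifts assume x86-64 with 4k base pages and 2M huge pmds. It only encodes the rule stated in the mail: a single 2M entry is possible only when both the guest pmd and the host shadow pmd are huge, otherwise the 2M range is assumed to fall back to 512 separate 4k translations.

```c
#include <stdbool.h>

/* Assumed x86-64 geometry: 4k base pages, 2M huge pmds. */
#define SMALL_PAGE_SHIFT 12
#define HUGE_PMD_SHIFT   21

/* Number of 4k pages covered by one huge pmd: 2M / 4k = 512. */
#define SMALL_PAGES_PER_HUGE (1UL << (HUGE_PMD_SHIFT - SMALL_PAGE_SHIFT))

/*
 * Hypothetical helper: the rule from the mail. The guest can get a
 * 2M TLB entry only if both translation levels map the range with a
 * huge pmd.
 */
static bool guest_gets_2m_tlb(bool guest_pmd_huge, bool host_pmd_huge)
{
        return guest_pmd_huge && host_pmd_huge;
}

/*
 * Hypothetical helper: TLB entries needed to cover one 2M-aligned
 * range in this toy model (1 if both pmds are huge, 512 otherwise).
 */
static unsigned long tlb_entries_for_2m(bool guest_pmd_huge,
                                        bool host_pmd_huge)
{
        return guest_gets_2m_tlb(guest_pmd_huge, host_pmd_huge)
                ? 1UL : SMALL_PAGES_PER_HUGE;
}
```

In this model, making only the guest pmd huge (or only the host one) still leaves 512 entries for the range, which is why the host side has to use huge pages for reasons that have nothing to do with virtualization.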