On Sat, Nov 12, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> host may write-protect a page. Second, the shadow and guest ptes may be
> in different formats (ept vs ia32).

I'm afraid I've lost you here... The shadow table and the to-be-shadowed
table are both ia32 (this is the normal shadow table code), or both ept
(the nested tdp code). When are they supposed to be in different formats
(ept vs ia32)?

I'm also puzzled: in what situation will the host write-protect an EPT02
(shadow EPT) page?

> In fact that happens to accidentally work, no? Intermediate ptes are
> always present/write/user, which translates to read/write/execute in EPT.

It didn't work, because the code also used to set the "accessed" bit
(bit 5), which is reserved on EPT and caused an EPT misconfiguration.
So I had to fix link_shadow_page, or nested EPT would not work at all.

> Don't optimize for least changes, optimize for best result afterwards.

As I'm sure you remember, two years ago, on September 6, 2009, you wrote
in your blog about the newly contributed nested VMX patch set, and in
particular its nested EPT (which predated the nested NPT contribution).

Nested EPT was, for some workloads, a huge performance improvement, but
you (if I understand correctly) did not want that code in KVM because it
basically optimized for getting the job done in the most correct and most
efficient manner, without regard to how cleanly it fit with the other
types of shadowing (normal shadow page tables, and nested NPT), or how
much of the code was being duplicated or circumvented.

So this time around, I couldn't really "not optimize for least changes".
This time, the nested EPT had to fit (like a square peg in a round
hole ;-)) into the preexisting MMU and NPT shadowing. I couldn't really
just write the most correct and most efficient code (which Orit Wasserman
already did, two years earlier).
This time I needed to figure out the least obtrusive way of changing the
existing code. The hardest part of doing this was trying to understand
all the complexities and subtleties of the existing MMU code in KVM,
which already handles 101 different cases in one overloaded piece of
code, and which is neither commented nor documented. And of course, add
to that all the complexities (some might even say "cruft") which the
underlying x86 architecture itself has accrued over the years. So it's
not surprising that I missed some of the important subtleties which had
no effect in the typical cases I tried. Like I said, in my tests nested
EPT *did* work. And even getting to that point was hard enough :-)

> We need a third variant of walk_addr_generic that parses EPT format
> PTEs. Whether that's best done by writing paging_ept.h or modifying
> paging_tmpl.h, I don't know.

Thanks. I'll think about everything you've said in this thread (I'm still
not convinced I understood all your points, so just understanding them
will be the first step). I'll see what I can do to improve the patch.

But I have to be honest - I'm not sure how quickly I can finish this.
I really appreciate all your comments about nested VMX in the last two
years - most of them have been spot-on, 100% correct, and really helpful
for making me understand things which I had previously misunderstood.
However, since you are (of course) extremely familiar with every nook and
cranny of KVM, what normally happens is that every comment which took you
5 minutes to figure out takes me 5 days to fully understand, and to
actually write, debug and test the fixed code. Every review that takes
you two days to go through (and is very much appreciated!) takes me
several months to fix each and every thing you asked for.

Don't get me wrong, I *am* planning to continue working (part-time) on
nested VMX, and nested EPT in particular.
But if you want to pick up the pace, I could use some help with the
actual coding from people who have much more intimate knowledge of the
non-nested-VMX parts of KVM than I have.

In the meantime, if anybody wants to experiment with a much faster nested
VMX than we had before, you can try my current patch. It may not be
perfect, but in many ways it is better than the old shadow-on-ept code.
And in simple (64 bit, 4k page) kvm-over-kvm configurations like the ones
I tried, it works well.

Nadav.

--
Nadav Har'El                        | Saturday, Nov 12 2011,
nyh@xxxxxxxxxxxxxxxxxxx             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |What's tiny, yellow and very dangerous? A
http://nadav.harel.org.il           |canary with the super-user password.