Re: [PATCH 02/10] nEPT: MMU context for nested EPT

On Sat, Nov 12, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> host may write-protect a page.  Second, the shadow and guest ptes may be
> in different formats (ept vs ia32).

I'm afraid I've lost you here... The shadow table and the to-be-shadowed 
table are both ia32 (this is the normal shadow table code), or both ept
(the nested tdp code). When are they supposed to be in different
formats (ept vs ia32)?

I'm also puzzled: in what situation will the host write-protect an EPT02
(shadow EPT) page?

> In fact that happens to accidentally work, no?  Intermediate ptes are
> always present/write/user, which translates to read/write/execute in EPT.

It didn't work because it also used to set the "accessed" bit, bit 5,
which on EPT is reserved and caused EPT misconfiguration. So I had to
fix link_shadow_page, or nested EPT would not work at all.
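
To make the bit clash concrete, here is a minimal standalone sketch (not the
actual KVM code; the helper names make_link_ia32, make_link_ept and
ept_nonleaf_has_reserved_bits are made up for illustration). It assumes the
EPT layout without accessed/dirty flags, where bits 7:3 of an entry that
points to a lower-level table must be zero:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ia32 page-table entry bits. */
#define PT_PRESENT_MASK     (1ULL << 0)
#define PT_WRITABLE_MASK    (1ULL << 1)
#define PT_USER_MASK        (1ULL << 2)
#define PT_ACCESSED_MASK    (1ULL << 5)

/* EPT entry bits: same positions 0..2, different meaning. */
#define EPT_READABLE_MASK   (1ULL << 0)
#define EPT_WRITABLE_MASK   (1ULL << 1)
#define EPT_EXECUTABLE_MASK (1ULL << 2)

/* Assumption: bits 7:3 are reserved in a non-leaf EPT entry. */
#define EPT_NONLEAF_RSVD_MASK (0x1fULL << 3)

/* Hypothetical: link entry as built for an ia32 shadow table. */
static inline uint64_t make_link_ia32(uint64_t child_pa)
{
    return child_pa | PT_PRESENT_MASK | PT_WRITABLE_MASK |
           PT_USER_MASK | PT_ACCESSED_MASK;
}

/* Hypothetical: link entry as needed for a shadow EPT table. */
static inline uint64_t make_link_ept(uint64_t child_pa)
{
    return child_pa | EPT_READABLE_MASK | EPT_WRITABLE_MASK |
           EPT_EXECUTABLE_MASK;
}

/* True if the entry sets a bit that is reserved in a non-leaf EPT entry. */
static inline bool ept_nonleaf_has_reserved_bits(uint64_t entry)
{
    return (entry & EPT_NONLEAF_RSVD_MASK) != 0;
}

/* Small self-check showing the difference between the two formats. */
static inline void check_link_formats(void)
{
    uint64_t child_pa = 0x1000; /* page-aligned address of the child table */

    /* P/W/U line up with R/W/X, but the accessed bit lands on a reserved bit. */
    assert(ept_nonleaf_has_reserved_bits(make_link_ia32(child_pa)));
    assert(!ept_nonleaf_has_reserved_bits(make_link_ept(child_pa)));
}
```

The low three bits come out identical in both formats, which is why it
"almost" works; only the accessed bit makes the ia32-style entry invalid
as an EPT entry.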

> Don't optimize for least changes, optimize for best result afterwards.

As I'm sure you remember, two years ago, on September 6, 2009, you wrote in
your blog about the newly contributed nested VMX patch set, and in
particular its nested EPT (which predated the nested NPT contribution).

Nested EPT was, for some workloads, a huge performance improvement, but
you (if I understand correctly) did not want that code in KVM because
it, basically, optimized for getting the job done, in the most correct
and most efficient manner - but without regard to how cleanly this fit with
other types of shadowing (normal shadow page tables, and nested NPT),
or how much of the code was being duplicated or circumvented.

So this time around, I couldn't really "not optimize for least changes".
This time, the nested EPT had to fit (like a square peg in a round hole
;-)), into the preexisting MMU and NPT shadowing. I couldn't really just write
the most correct and most efficient code (which Orit Wasserman already
did, two years earlier). This time I needed to figure out the least obtrusive
way of changing the existing code. The hardest thing about doing this
was trying to understand all the complexities and subtleties of the existing
MMU code in KVM, which already handles 101 different cases in one
overloaded piece of code, which is not commented or documented.
And of course, add to that all the complexities (some might even say "cruft")
which the underlying x86 architecture itself has accrued over the years.
So it's not surprising I've missed some of the important subtleties which
didn't have any effect in the typical case I've tried. Like I said, in my
tests nested EPT *did* work. And even getting to that point was hard enough :-)

> We need a third variant of walk_addr_generic that parses EPT format
> PTEs.  Whether that's best done by writing paging_ept.h or modifying
> paging_tmpl.h, I don't know.

Thanks. I'll think about everything you've said in this thread (I'm still
not convinced I understood all your points, so just understanding them
will be the first step). I'll see what I can do to improve the patch.

But I have to be honest - I'm not sure how quickly I can finish this.
I really appreciate all your comments about nested VMX in the last two
years - most of them have been spot-on, 100% correct, and really helpful
for making me understand things which I had previously misunderstood.
However, since you are (of course) extremely familiar with every nook and
cranny of KVM, what normally happens is that every comment which took you
5 minutes to figure out, takes me 5 days to fully understand, and to actually
write, debug and test the fixed code. Every review that takes you two days
to go through (and is very much appreciated!) takes me several months to fix
each and every thing you asked for.

Don't get me wrong, I *am* planning to continue working (part-time) on nested
VMX, and nested EPT in particular. But if you want it to pick up the pace,
I could use some help with actual coding from people who have much more
intimate knowledge of the non-nested-VMX parts of KVM than I have.

In the meantime, if anybody wants to experiment with a much faster
Nested VMX than we had before, you can try my current patch. It may not
be perfect, but in many ways it is better than the old shadow-on-ept code.
And in simple (64 bit, 4k page) kvm-over-kvm configurations like I tried, it
works well.

Nadav.

-- 
Nadav Har'El                        |                  Saturday, Nov 12 2011, 
nyh@xxxxxxxxxxxxxxxxxxx             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |What's tiny, yellow and very dangerous? A
http://nadav.harel.org.il           |canary with the super-user password.