Re: [PATCH 02/10] nEPT: MMU context for nested EPT

On 11/12/2011 11:37 PM, Nadav Har'El wrote:
> On Sat, Nov 12, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> > host may write-protect a page.  Second, the shadow and guest ptes may be
> > in different formats (ept vs ia32).
>
> I'm afraid I've lost you here... The shadow table and the to-be-shadowed 
> table are either both ia32 (this is the normal shadow table code) or both ept
> (the nested tdp code). When are they supposed to be in different
> formats (ept vs ia32)?

Er, the ia32/ept combo only happens when the host ignores the ia32 ptes
(non-nested on ept), so it's not very interesting.

> I'm also puzzled: in what situation will the host write-protect an EPT02
> (shadow EPT) page?

We only write-protect guest pages, but the permissions that enforce that
protection live in the shadow pages.
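
To make that concrete, here is a rough illustration (not the actual KVM
code; the helper and its arguments are made up): the guest's EPT12 entry
may allow writes, but if the host has write-protected the backing page,
the EPT02 entry we build must drop the write permission, so the write
faults back to the host.

#include <stdbool.h>
#include <stdint.h>

/* EPT permission bits: bit 0 = read, bit 1 = write, bit 2 = execute. */
#define EPT_READ	(1ULL << 0)
#define EPT_WRITE	(1ULL << 1)
#define EPT_EXEC	(1ULL << 2)

/* Hypothetical helper: permission bits for a shadow (EPT02) leaf entry. */
static uint64_t ept02_perms(uint64_t ept12_perms, bool host_write_protected)
{
	uint64_t perms = ept12_perms & (EPT_READ | EPT_WRITE | EPT_EXEC);

	if (host_write_protected)
		perms &= ~EPT_WRITE;	/* the host's protection wins */

	return perms;
}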

> > In fact that happens to accidentally work, no?  Intermediate ptes are
> > always present/write/user, which translates to read/write/execute in EPT.
>
> It didn't work because it also used to set the "accessed" bit, bit 5,
> which is reserved on EPT and caused an EPT misconfiguration. So I had to
> fix link_shadow_page, or nested EPT would not work at all.

Look at how __direct_map() does it.
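
The bit layouts make the point clearer.  A rough illustration only (not
the actual link_shadow_page()/__direct_map() code; the helper is made up):

#include <stdint.h>

/* ia32 page-table bits used for an intermediate (non-leaf) shadow entry. */
#define PT_PRESENT	(1ULL << 0)
#define PT_WRITABLE	(1ULL << 1)
#define PT_USER		(1ULL << 2)
#define PT_ACCESSED	(1ULL << 5)

/*
 * present|write|user lines up, bit for bit, with EPT's read|write|execute
 * (bits 0, 1 and 2), which is why linking intermediate shadow pages
 * "accidentally" works on EPT.  The ia32 accessed bit is what breaks it:
 * bit 5 is reserved in a non-leaf EPT entry, so setting it makes the CPU
 * report an EPT misconfiguration.
 */
static uint64_t intermediate_perms(int shadow_is_ept)
{
	uint64_t perms = PT_PRESENT | PT_WRITABLE | PT_USER;

	if (!shadow_is_ept)
		perms |= PT_ACCESSED;	/* fine for ia32, fatal for EPT */

	return perms;
}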

> > Don't optimize for least changes, optimize for best result afterwards.
>
> As I'm sure you remember, two years ago, on September 6, 2009, you wrote in
> your blog about the newly contributed nested VMX patch set, and in
> particular its nested EPT (which predated the nested NPT contribution).
>
> Nested EPT was, for some workloads, a huge performance improvement, but
> you (if I understand correctly) did not want that code in KVM because
> it, basically, optimized for getting the job done in the most correct
> and most efficient manner - but without regard to how cleanly this fit with
> other types of shadowing (normal shadow page tables, and nested NPT),
> or how much of the code was being duplicated or circumvented.

I mean "best result" in terms of maintainability - how the code will
look, not performance results.  Don't optimize the size of the patch,
optimize the size of the patched code (and don't take me literally -
small code size doesn't correlate with maintainable code).

> So this time around, I couldn't really "not optimize for least changes".
> This time, the nested EPT had to fit (like a square peg in a round hole
> ;-)) into the preexisting MMU and NPT shadowing. I couldn't really just write
> the most correct and most efficient code (which Orit Wasserman already
> did, two years earlier). 

What is not correct (apart from what we identified) or not efficient in
the current code?

> This time I needed to figure out the least obtrusive
> way of changing the existing code. The hardest thing about doing this
> was trying to understand all the complexities and subtleties of the existing
> MMU code in KVM, which already handles 101 different cases in one
> overloaded piece of code that is neither commented nor documented.
> And of course, add to that all the complexities (some might even say "cruft")
> which the underlying x86 architecture itself has accrued over the years.
> So it's not surprising that I missed some of the important subtleties which
> didn't have any effect in the typical cases I tried. Like I said, in my
> tests nested EPT *did* work. And even getting to that point was hard enough :-)
>
> > We need a third variant of walk_addr_generic that parses EPT format
> > PTEs.  Whether that's best done by writing paging_ept.h or modifying
> > paging_tmpl.h, I don't know.
>
> Thanks. I'll think about everything you've said in this thread (I'm still
> not convinced I understood all your points, so just understanding them
> will be the first step). I'll see what I can do to improve the patch.
>
> But I have to be honest - I'm not sure how quickly I can finish this.
> I really appreciate all your comments about nested VMX in the last two
> years - most of them have been spot-on, 100% correct, and really helpful
> for making me understand things which I had previously misunderstood.
> However, since you are (of course) extremely familiar with every nook and
> cranny of KVM, what normally happens is that every comment which took you
> 5 minutes to figure out takes me 5 days to fully understand, and to actually
> write, debug and test the fixed code. Every review that takes you two days
> to go through (and is very much appreciated!) takes me several months to fix
> each and every thing you asked for.

Feel free to ask around on the mailing list and on IRC, and to post questions
or pseudo-code for review.  Some problems can be caught early.
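
For what it's worth, the EPT walker I have in mind mostly comes down to
teaching the lowest-level pte checks a third format.  A rough sketch,
illustrative only (the helpers are made up; the real thing would go
through the paging_tmpl.h template machinery, which is currently compiled
once for PTTYPE 64 and once for PTTYPE 32):

#include <stdbool.h>
#include <stdint.h>

/* ia32: an entry is present if bit 0 is set. */
static bool ia32_pte_present(uint64_t pte)
{
	return pte & 1;
}

/*
 * EPT: there is no dedicated present bit; an entry is present if any of
 * read/write/execute (bits 2:0) is set.
 */
static bool ept_pte_present(uint64_t pte)
{
	return (pte & 0x7) != 0;
}

/*
 * EPT has no user/supervisor bit, and bit 2 means "executable" rather
 * than "user", so the permission-fault logic can't be shared verbatim
 * with the ia32 walker.
 */
static bool ept_pte_exec(uint64_t pte)
{
	return pte & (1ULL << 2);
}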

> Don't get me wrong, I *am* planning to continue working (part-time) on nested
> VMX, and nested EPT in particular. But if you want the pace to pick up,
> I could use some help with actual coding from people who have much more
> intimate knowledge of the non-nested-VMX parts of KVM than I have.

I do plan to write some code, but it will actually make your job
somewhat harder - I'd like to write a test framework for nvmx, which
will test all sorts of odd combinations.

> In the meantime, if anybody wants to experiment with a much faster
> Nested VMX than we had before, you can try my current patch. It may not
> be perfect, but in many ways it is better than the old shadow-on-ept code.
> And in simple (64-bit, 4K-page) kvm-over-kvm configurations like the ones
> I tried, it works well.

Easiest if you post a git URL.

-- 
error compiling committee.c: too many arguments to function


