On Mon, Aug 09, 2021 at 03:30:08PM +0000, Sean Christopherson wrote:
> On Mon, Aug 09, 2021, Yu Zhang wrote:
> > On Sun, Aug 08, 2021 at 11:33:44PM -0500, Wei Huang wrote:
> > >
> > > On 8/8/21 11:27 PM, Yu Zhang wrote:
> > > > On Sun, Aug 08, 2021 at 11:11:40PM -0500, Wei Huang wrote:
> > > > >
> > > > >
> > > > > On 8/8/21 10:58 PM, Yu Zhang wrote:
> > > > > > On Sun, Aug 08, 2021 at 02:26:56PM -0500, Wei Huang wrote:
> > > > > > > AMD future CPUs will require a 5-level NPT if host CR4.LA57 is set.
> > > > > >
> > > > > > Sorry, but why? NPT is not indexed by HVA.
> > > > >
> > > > > NPT is not indexed by HVA - it is always indexed by GPA. What I meant is NPT
> > > > > page table level has to be the same as the host OS page table: if 5-level
> > > > > page table is enabled in host OS (CR4.LA57=1), guest NPT has to 5-level too.
> > > >
> > > > I know what you meant. But may I ask why?
> > >
> > > I don't have a good answer for it. From what I know, VMCB doesn't have a
> > > field to indicate guest page table level. As a result, hardware relies on
> > > host CR4 to infer NPT level.
> >
> > I guess you mean not even in the N_CR3 field of VMCB?
>
> Correct, nCR3 is a basically a pure representation of a regular CR3.
>
> > Then it's not a broken design - it's a limitation of SVM. :)
>
> That's just a polite way of saying it's a broken design ;-)
>
> Joking aside, NPT opted for a semblance of backwards compatibility at the cost of
> having to carry all the baggage that comes with a legacy design. Keeping the core
> functionality from IA32 paging presumably miminizes design and hardware costs, and
> required minimal enabling in hypervisors. The downside is that it's less flexible
> than EPT and has a few warts, e.g. shadowing NPT is gross because the host can't
> easily mirror L1's desired paging mode.

Thanks for your explanation, Sean. Everything has a cost, and now it's time to
pay the price. :)

As to the max level, it is calculated in kvm_init().
Though I do not see any chance that the host paging mode will change after
kvm_init(), or any case where Linux uses different paging levels on different
pCPUs, I am wondering whether we should do something, e.g., document this as
an open issue?

About "host can't easily mirror L1's desired paging mode", could you please
elaborate? Thanks!

B.R.
Yu
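
For illustration only, here is a minimal sketch of the inference discussed in this thread: the NPT depth follows host CR4.LA57, and the concern about pCPUs amounts to checking that every CPU implies the same depth. It assumes a 64-bit host. The function names are hypothetical and are not KVM functions; only the CR4.LA57 bit position (bit 12) is architectural.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CR4.LA57 (bit 12) enables 5-level paging on x86-64. */
#define CR4_LA57 (UINT64_C(1) << 12)

/* Hypothetical helper, not a KVM function: NPT depth implied by host CR4. */
int npt_levels_from_host_cr4(uint64_t host_cr4)
{
    return (host_cr4 & CR4_LA57) ? 5 : 4;
}

/*
 * Hypothetical check: true if every sampled per-CPU CR4 value implies the
 * same NPT depth as the first one (trivially true for zero or one CPU).
 */
bool npt_levels_consistent(const uint64_t *cr4_per_cpu, size_t ncpus)
{
    for (size_t i = 1; i < ncpus; i++) {
        if (npt_levels_from_host_cr4(cr4_per_cpu[i]) !=
            npt_levels_from_host_cr4(cr4_per_cpu[0]))
            return false;
    }
    return true;
}
```

A real implementation would read CR4 on each CPU rather than take an array; the array here only keeps the sketch self-contained.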