On 4/14/2022 3:03 AM, Sean Christopherson wrote: > On Thu, Mar 24, 2022, Manali Shukla wrote: >> Current implementation of nested page table does the page >> table build up statistically with 2048 PTEs and one pml4 entry. >> That is why current implementation is not extensible. >> >> New implementation does page table build up dynamically based >> on the RAM size of the VM which enables us to have separate >> memory range to test various npt test cases. >> >> Signed-off-by: Manali Shukla <manali.shukla@xxxxxxx> >> --- >> x86/svm.c | 163 ++++++++++++++++++++++++++++++++++---------------- > > Ok, so I got fairly far into reviewing this (see below, but it can be ignored) > before realizing that all this new code is nearly identical to what's in lib/x86/vm.c. > E.g. find_pte_level() and install_pte() can probably used almost verbatim. > > Instead of duplicating code, can you extend vm.c to as necessary? It might not > even require any changes. I'll happily clean up vm.c in the future, e.g. to fix > the misleading nomenclature and open coded horrors, but for your purposes I think > you should be able to get away with a bare minimum of changes. > >> x86/svm.h | 17 +++++- >> x86/svm_npt.c | 4 +- >> 3 files changed, 130 insertions(+), 54 deletions(-) >> >> diff --git a/x86/svm.c b/x86/svm.c >> index d0d523a..67dbe31 100644 >> --- a/x86/svm.c >> +++ b/x86/svm.c >> @@ -8,6 +8,7 @@ >> #include "desc.h" >> #include "msr.h" >> #include "vm.h" >> +#include "fwcfg.h" >> #include "smp.h" >> #include "types.h" >> #include "alloc_page.h" >> @@ -16,38 +17,67 @@ >> #include "vmalloc.h" >> >> /* for the nested page table*/ >> -u64 *pte[2048]; >> -u64 *pde[4]; >> -u64 *pdpe; >> u64 *pml4e; >> >> struct vmcb *vmcb; >> >> -u64 *npt_get_pte(u64 address) >> +u64* get_npt_pte(u64 *pml4, > > Heh, the usual way to handle wrappers is to add underscores, i.e. > > u64 *npt_get_pte(u64 address) > { > return __npt_get_pte(npt_get_pml4e(), address, 1); > } > > swapping the order just results in namespacing wierdness and doesn't convey to the > reader that this is an "inner" helper. > >> u64 guest_addr, int level) > > Assuming guest_addr is a gpa, call it gpa to avoid ambiguity over virtual vs. > physical. > >> { >> - int i1, i2; >> + int l; >> + u64 *pt = pml4, iter_pte; > > Please point pointers and non-pointers on separate lines. And just "pte" for > the tmp, it's not actually used as an iterator. And with that, I have a slight > preference for page_table over pt so that it's not mistaken for pte. > >> + unsigned offset; > > No bare unsigned please. And "offset" is the wrong terminology, "index" or "idx" > is preferable. An offset is usually an offset in bytes, this indexes into a u64 > array. > > Ugh, looks like that awful name comes from PGDIR_OFFSET in lib/x86/asm/page.h. > The offset, at least in Intel SDM terminology, it specifically the last N:0 bits > of the virtual address (or guest physical) that are the offset into the physical > page, e.g. 11:0 for a 4kb page, 20:0 for a 2mb page. > >> + >> + assert(level >= 1 && level <= 4); > > The upper bound should be NPT_PAGE_LEVEL, or root_level (see below). > >> + for(l = NPT_PAGE_LEVEL; ; --l) { > > Nit, need a space after "for". > > Also, can you plumb in the root level? E.g. have npt_get_pte() hardcode the > root in this case. At some point this will hopefully support 5-level NPT, at > which point hardcoding the root will require updating more code than should be > necessary. > >> + offset = (guest_addr >> (((l - 1) * NPT_PGDIR_WIDTH) + 12)) >> + & NPT_PGDIR_MASK; > > Not your code (I think), but NPT_PGDIR_MASK is an odd name since it's common to > all. The easiest thing would be to loosely follow KVM. Actually, I think it > makes sense to grab the PT64_ stuff from KVM > > #define PT64_BASE_ADDR_MASK (((1ULL << 52) - 1) & ~(u64)(PAGE_SIZE-1)) > #define PT64_LEVEL_BITS 9 > #define PT64_LEVEL_SHIFT(level) \ > (PAGE_SHIFT + (level - 1) * PT64_LEVEL_BITS) > #define PT64_INDEX(address, level)\ > (((address) >> PT64_LEVEL_SHIFT(level)) & ((1 << PT64_LEVEL_BITS) - 1)) > > > and then use those instead of having dedicated NPT_* defines. That makes it more > obvious that (a) SVM/NPT tests are 64-bit only and (b) there's nothing special > about NPT with respect to "legacy" 64-bit paging. > > That will provide a nice macro, PT64_INDEX, to replace the open coded calcuations. > >> + if (l == level) >> + break; >> + if (!(iter_pte & NPT_PRESENT)) >> + return false; > > Return "false" works, but it's all kinds of wrong. This should either assert or > return NULL. > >> + pt = (u64*)(iter_pte & PT_ADDR_MASK); >> + } >> + offset = (guest_addr >> (((l - 1) * NPT_PGDIR_WIDTH) + 12)) >> + & NPT_PGDIR_MASK; > > Hmm, this is unnecessary because the for-loop can't terminate on its own, it > can only exit on "l == level", and offset is already correct in that case. Hey Sean, Thank you so much for reviewing the code. I will work on the comments. - Manali