Hello, thank you for the review.

On Sat, May 14, 2022 at 7:43 AM David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > +/*
> > + * Special pages are pages to hold PAE PDPTEs for 32bit guest or higher level
> > + * pages linked to special page when shadowing NPT.
> > + *
> > + * Special pages are specially allocated. If sp->spt needs to be 32bit, it
>
> I'm not sure what you mean by "If sp->spt needs to be 32bit". Do you mean
> "If sp shadows a 32-bit PAE page table"?

"If sp->spt needs to be put in a 32bit CR3 (even on x86_64)"

> > + * will use the preallocated mmu->pae_root.
> > + *
> > + * Special pages are only visible to local VCPU except through rmap from their
> > + * children, so they are not in the kvm->arch.active_mmu_pages nor in the hash.
> > + *
> > + * And they are either accounted nor write-protected since they don't has gfn
> > + * associated.
>
> Instead of "has gfn associated", how about "shadow a guest page table"?

Did in v3.

> > + *
> > + * Special pages can be obsoleted but might be possibly reused later. When
> > + * the obsoleting process is done, all the obsoleted shadow pages are unlinked
> > + * from the special pages by the help of the parent rmap of the children and
> > + * the special pages become theoretically valid again. If there is no other
> > + * event to cause a VCPU to free the root and the VCPU is being preempted by
> > + * the host during two obsoleting processes, the VCPU can reuse its special
> > + * pages when it is back.
>
> Sorry I am having a lot of trouble parsing this paragraph.

This paragraph is rewritten in v3.

> > + */
>
> This comment (and more broadly, this series) mixes "special page",
> "special root", "special root page", and "special shadow page". Can you
> be more consistent with the terminology?

In v3, there are only "local shadow page" and "local root shadow page",
and "local root shadow page" can be shortened to "local root page".
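(Aside, illustration only: the "32bit CR3" point above is why mmu->pae_root
has to be preallocated instead of taken from the page cache. A 32-bit CR3
can only hold a 32-bit physical address, so the PAE root table must sit
below 4 GiB; an arbitrary allocation on x86_64 gives no such guarantee.
fits_in_32bit_cr3() below is a made-up name for this sketch, not a KVM
function.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustration only: a 32-bit CR3 can hold at most a 32-bit physical
 * address, so a table installed into it must live below 4 GiB. A
 * preallocated pae_root can guarantee that; a generic cache allocation
 * on x86_64 might land above 4 GiB and be unusable here.
 */
bool fits_in_32bit_cr3(uint64_t hpa)
{
	return (hpa >> 32) == 0;
}
```

For example, a table at 0xfffff000 is usable, while one at or above
1ULL << 32 is not.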
> > +static struct kvm_mmu_page *kvm_mmu_alloc_special_page(struct kvm_vcpu *vcpu,
> > +                                                       union kvm_mmu_page_role role)
> > +{
> > +	struct kvm_mmu_page *sp;
> > +
> > +	sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
> > +	sp->gfn = 0;
> > +	sp->role = role;
> > +	if (role.level == PT32E_ROOT_LEVEL &&
> > +	    vcpu->arch.mmu->root_role.level == PT32E_ROOT_LEVEL)
> > +		sp->spt = vcpu->arch.mmu->pae_root;
>
> Why use pae_root here instead of allocating from the cache?

Because of 32bit cr3. The comment is updated in v3.

> > +static void mmu_free_special_root_page(struct kvm *kvm, struct kvm_mmu *mmu)
> > +{
> > +	u64 spte = mmu->root.hpa;
> > +	struct kvm_mmu_page *sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> > +	int i;
> > +
> > +	/* Free level 5 or 4 roots for shadow NPT for 32 bit L1 */
> > +	while (sp->role.level > PT32E_ROOT_LEVEL)
> > +	{
> > +		spte = sp->spt[0];
> > +		mmu_page_zap_pte(kvm, sp, sp->spt + 0, NULL);
>
> Instead of using mmu_page_zap_pte(..., NULL) what about creating a new
> helper that just does drop_parent_pte(), since that's all you really
> want?

There are several places using mmu_page_zap_pte(..., NULL) in mmu.c.
mmu_page_zap_pte() is more general, and reviewers don't need to
understand extra code and extra comments. For example, some comments
would be needed to explain that sp->spt[i] is not a large page when
disconnecting the PAE root from the 4 PAE page directories if a helper
that just does drop_parent_pte() were used.

> > +		free_page((unsigned long)sp->spt);
> > +		kmem_cache_free(mmu_page_header_cache, sp);
> > +		if (!is_shadow_present_pte(spte))
> > +			return;
> > +		sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> > +	}
> > +
> > +	if (WARN_ON_ONCE(sp->role.level != PT32E_ROOT_LEVEL))
> > +		return;
> > +
> > +	/* Free PAE roots */
>
> nit: This loop does not do any freeing, it just disconnects the PAE root
> table from the 4 PAE page directories.
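(Aside, illustration only: the free path under discussion walks down from
a level-5/4 root, freeing each higher-level page, and at the PAE level
only *disconnects* the entries, which is the reviewer's point about the
"Free PAE roots" comment. The sketch below models that shape in
self-contained C; struct toy_sp, toy_alloc(), and toy_free_root() are
invented names, a much-simplified stand-in for struct kvm_mmu_page and
mmu_free_special_root_page().)

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_PAE_LEVEL 3	/* stands in for PT32E_ROOT_LEVEL */

/* Hypothetical miniature of struct kvm_mmu_page. */
struct toy_sp {
	int level;
	struct toy_sp *child;	/* stands in for spt[0] pointing down */
	int entries_present;	/* stands in for the 4 PAE PDPTEs */
};

struct toy_sp *toy_alloc(int level, struct toy_sp *child)
{
	struct toy_sp *sp = malloc(sizeof(*sp));

	sp->level = level;
	sp->child = child;
	sp->entries_present = 4;
	return sp;
}

/*
 * Walk from the root down to the PAE-level page: free every
 * higher-level page on the way, then only disconnect the PAE page's
 * entries (the PAE page directories themselves are torn down
 * elsewhere). Returns how many page headers were released.
 */
int toy_free_root(struct toy_sp *root)
{
	struct toy_sp *sp = root;
	int freed = 0;

	while (sp->level > TOY_PAE_LEVEL) {
		struct toy_sp *next = sp->child;

		free(sp);
		freed++;
		if (!next)
			return freed;
		sp = next;
	}

	/* Disconnect PAE root from its page directories, don't free them. */
	sp->entries_present = 0;
	free(sp);
	return freed + 1;
}
```

With a level-5 root chained through level 4 down to the PAE level, the
walk releases three page headers in total.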
> So how about:
>
> /* Disconnect PAE root from the 4 PAE page directories */

Did in v3.

Thanks
Lai