On Mon, Apr 22, 2024 at 6:29 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> +Tom, Mike, and Peter
>
> On Thu, Apr 18, 2024, Li RongQing wrote:
> > save_area of per-CPU svm_data are dominantly accessed from their
> > own local CPUs, so allocate them node-local for performance reasons
> >
> > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> > ---
> >  arch/x86/kvm/svm/sev.c | 6 +++---
> >  arch/x86/kvm/svm/svm.c | 2 +-
> >  arch/x86/kvm/svm/svm.h | 6 +++++-
> >  3 files changed, 9 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index 61a7531..cce8ec7 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -3179,13 +3179,13 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> >  		ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
> >  }
> >
> > -struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> > +struct page *snp_safe_alloc_page_node(struct kvm_vcpu *vcpu, int node)
> >  {
> >  	unsigned long pfn;
> >  	struct page *p;
> >
> >  	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> > -		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> > +		return alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 0);
> >
> >  	/*
> >  	 * Allocate an SNP-safe page to workaround the SNP erratum where
> > @@ -3196,7 +3196,7 @@ struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> >  	 * Allocate one extra page, choose a page which is not
> >  	 * 2MB-aligned, and free the other.
> >  	 */
> > -	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> > +	p = alloc_pages_node(node, GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
>
> This made me realize the existing code is buggy. The allocation for the per-CPU
> save area shouldn't be accounted.
>
> Also, what's the point of taking @vcpu? It's a nice enough flag to say "this
> should be accounted", but it's decidedly odd.
>
> How about we fix both in a single series, and end up with this over 3-4 patches?
> I.e. pass -1 where vcpu is non-NULL, and the current CPU for the save area.

Looks good to me. Internally we already use GFP_KERNEL for these
allocations, but we ran into an issue with split_page() and memcg
accounting in our kernel. Yosry submitted the following internally:

+	if (memcg_kmem_charge(p, GFP_KERNEL_ACCOUNT, 0)) {
+		__free_page(p);
+		return NULL;
+	}

I'm not sure whether that is an issue specific to our kernel or whether
we should be using split_page_memcg() here; split_page_memcg() was
suggested internally, but we didn't want to backport it.

> struct page *snp_safe_alloc_page(int cpu)
> {
> 	unsigned long pfn;
> 	struct page *p;
> 	gfp_t gfp;
> 	int node;
>
> 	if (cpu >= 0) {
> 		node = cpu_to_node(cpu);
> 		gfp = GFP_KERNEL;
> 	} else {
> 		node = NUMA_NO_NODE;
> 		gfp = GFP_KERNEL_ACCOUNT;
> 	}
> 	gfp |= __GFP_ZERO;
>
> 	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
> 		return alloc_pages_node(node, gfp, 0);
>
> 	/*
> 	 * Allocate an SNP-safe page to workaround the SNP erratum where
> 	 * the CPU will incorrectly signal an RMP violation #PF if a
> 	 * hugepage (2MB or 1GB) collides with the RMP entry of a
> 	 * 2MB-aligned VMCB, VMSA, or AVIC backing page.
> 	 *
> 	 * Allocate one extra page, choose a page which is not
> 	 * 2MB-aligned, and free the other.
> 	 */
> 	p = alloc_pages_node(node, gfp, 1);
> 	if (!p)
> 		return NULL;
>
> 	split_page(p, 1);
>
> 	pfn = page_to_pfn(p);
> 	if (IS_ALIGNED(pfn, PTRS_PER_PMD))
> 		__free_page(p++);
> 	else
> 		__free_page(p + 1);
>
> 	return p;
> }
>
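
For my own clarity, here's roughly how the two kinds of call sites would
look with that signature. This is just a sketch based on my reading of the
current code; the exact variable names (sd->save_area, vmcb01_page) may not
match what the series ends up touching:

	/* Per-CPU save area in svm_cpu_init(): node-local, not memcg-accounted. */
	sd->save_area = snp_safe_alloc_page(cpu);

	/* vCPU-owned pages (VMCB/VMSA/etc.): keep memcg accounting, no NUMA hint. */
	vmcb01_page = snp_safe_alloc_page(-1);

i.e. exactly the "pass -1 where vcpu is non-NULL, and the current CPU for
the save area" split you describe above.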