On Fri, Jan 03, 2025 at 07:32:46AM +0800, Edgecombe, Rick P wrote:
> > +u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
> > +{
>
> hpa should be struct page, or as Yan had been ready to propose a folio and idx.

The consideration of folio + idx is to make sure the physical addresses for
"level" are contiguous, while allowing KVM to request mapping of a small page
contained in a huge folio.

> I would have thought a struct page would be sufficient for now. She also planned
> to add a level arg, which today should always be 4k, but would be needed for
> future huge page support.

Yes, in the previous version, "level" is embedded in param "gpa" and implicitly
set to 0 by KVM.

The planned changes to tdh_mem_page_aug() are as follows:
- Use struct tdx_td instead of raw TDR u64.
- Use extended_err1/2 instead of rcx/rdx for output.
- Change "u64 gpa" to "gfn_t gfn".
- Use union tdx_sept_gpa_mapping_info to initialize args.rcx.
- Use "struct folio *" + "unsigned long folio_page_idx" instead of raw hpa for
  the guest page in tdh_mem_page_aug() and fail if a page (huge or not) to aug
  is not contained in a single folio.
- Add a new param "level".
- Fail the wrapper if "level" is not 4K-1G.
- Call tdx_clflush_page() instead of clflush_cache_range() and loop
  tdx_clflush_page() over each 4k page in a huge page to aug.

+u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, int level, struct folio *private_folio,
+		     unsigned long folio_page_idx, u64 *extended_err1, u64 *extended_err2)
+{
+	union tdx_sept_gpa_mapping_info gpa_info = { .level = level, .gfn = gfn, };
+	struct tdx_module_args args = {
+		.rcx = gpa_info.full,
+		.rdx = tdx_tdr_pa(td),
+		.r8 = page_to_phys(folio_page(private_folio, folio_page_idx)),
+	};
+	unsigned long nr_pages = 1 << (level * 9);
+	u64 ret;
+
+	if (!(level >= TDX_PS_4K && level < TDX_PS_NR) ||
+	    (folio_page_idx + nr_pages > folio_nr_pages(private_folio)))
+		return -EINVAL;
+
+	while (nr_pages--)
+		tdx_clflush_page(folio_page(private_folio, folio_page_idx++));
+
+	ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
+
+	*extended_err1 = args.rcx;
+	*extended_err2 = args.rdx;
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_aug);

The corresponding changes in KVM:

 static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn)
 {
-	put_page(pfn_to_page(pfn));
+	folio_put(page_folio(pfn_to_page(pfn)));
 }

 static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 			    enum pg_level level, kvm_pfn_t pfn)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
-	hpa_t hpa = pfn_to_hpa(pfn);
-	gpa_t gpa = gfn_to_gpa(gfn);
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct page *private_page = pfn_to_page(pfn);
 	u64 entry, level_state;
 	u64 err;

-	err = tdh_mem_page_aug(tdr_pa, gpa, hpa, &entry, &level_state);
+	err = tdh_mem_page_aug(&kvm_tdx->td, gfn, tdx_level, page_folio(private_page),
+			       folio_page_idx(page_folio(private_page), private_page),
+			       &entry, &level_state);
 	if (unlikely(err & TDX_OPERAND_BUSY)) {
 		tdx_unpin(kvm, pfn);
 		return -EBUSY;
@@ -1620,9 +1621,9 @@ int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
 	 * migration.  Until guest_memfd supports page migration, prevent page
 	 * migration.
 	 * TODO: Once guest_memfd introduces callback on page migration,
-	 * implement it and remove get_page/put_page().
+	 * implement it and remove folio_get/folio_put().
 	 */
-	get_page(pfn_to_page(pfn));
+	folio_get(page_folio(pfn_to_page(pfn)));

> I think we should try to keep it as simple as possible for now.

Yeah.
So, do you think we need to have tdh_mem_page_aug() support 4K level pages only
and ask for Dave's review again for huge pages?

Do we need to add param "level"?
- If yes, "struct page" does not look like a good fit.
- If not, hardcode it as 0 in the wrapper and convert "pfn" to "struct page"?
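
For reference, here is a minimal userspace sketch of the level and
folio-containment check in the proposed wrapper. It is only a model, not kernel
code: struct folio_model, pages_per_level() and aug_range_ok() are hypothetical
names, and the TDX_PS_* values are assumed to be 0/1/2 for 4K/2M/1G.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for the TDX S-EPT page-size levels (0 = 4K, 1 = 2M, 2 = 1G). */
enum { TDX_PS_4K = 0, TDX_PS_2M = 1, TDX_PS_1G = 2, TDX_PS_NR = 3 };

static_assert(TDX_PS_NR == 3, "the model covers the 4K, 2M and 1G levels only");

/* Hypothetical stand-in for struct folio: only the page count matters here. */
struct folio_model {
	unsigned long nr_pages;
};

/* Number of 4K pages covered by one mapping at @level, i.e. 512^level. */
static unsigned long pages_per_level(int level)
{
	return 1UL << (level * 9);
}

/*
 * Return true if @level is within 4K-1G and the page range
 * [idx, idx + pages_per_level(level)) lies entirely inside @folio.
 */
static bool aug_range_ok(const struct folio_model *folio, unsigned long idx,
			 int level)
{
	/* Validate the level first so the shift never sees a bad value. */
	if (level < TDX_PS_4K || level >= TDX_PS_NR)
		return false;

	return idx + pages_per_level(level) <= folio->nr_pages;
}
```

With a 512-page folio, a 2M aug at index 0 and a 4K aug at index 511 both pass,
while a 2M aug at index 1 fails because the range would run past the end of the
folio. This is the case the folio + idx interface is meant to catch.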