On 7/30/21 9:00 AM, Vlastimil Babka wrote:
> On 7/7/21 8:35 PM, Brijesh Singh wrote:
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4407,6 +4407,15 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
>>  		return 0;
>>  	}
>>  
>> +static int handle_split_page_fault(struct vm_fault *vmf)
>> +{
>> +	if (!IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
>> +		return VM_FAULT_SIGBUS;
>> +
>> +	__split_huge_pmd(vmf->vma, vmf->pmd, vmf->address, false, NULL);
>> +	return 0;
>> +}
>> +
> I think back in v1 Dave asked if khugepaged will just coalesce this back, and it
> wasn't ever answered AFAICS.
>
> I've checked the code and I think the answer is: no. Khugepaged isn't designed
> to coalesce a pte-mapped hugepage back to pmd in place. And the usual way (copy
> to a new huge page) I think will not succeed because IIRC the page is also
> FOLL_PIN pinned and khugepaged_scan_pmd() will see the elevated refcounts via
> is_refcount_suitable() and give up.

I _thought_ this was the whole "PTE mapped THP" bit of code, like
collapse_pte_mapped_thp().  But, looking at it again, I think that code
is just for the huge tmpfs flavor of THP.

Either way, I'm kinda surprised that we don't collapse things in place.
Especially in the early days, there were lots of crazy things that
split THPs.  I think even things like /proc/$pid/smaps split them.

In any case, it sounds like SEV-SNP users should probably be advised to
use MADV_NOHUGEPAGE to avoid any future surprises.  At least until the
hardware folks get their act together and teach the TLB how to fracture
2M entries properly. :)
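
For completeness, a minimal user-space sketch of that MADV_NOHUGEPAGE
advice.  This is just an illustration, not anything from the patch
series; the mapping size and the anonymous mmap() setup are made-up
assumptions for the example:

	/*
	 * Hypothetical illustration of the MADV_NOHUGEPAGE advice above;
	 * the 16 MiB size and anonymous mapping are arbitrary choices for
	 * this sketch, not taken from the SEV-SNP patches.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 16 * 1024 * 1024;	/* 16 MiB region */

		void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (buf == MAP_FAILED) {
			perror("mmap");
			return EXIT_FAILURE;
		}

		/*
		 * Ask the kernel not to back this range with transparent
		 * hugepages, so a later PMD split is never needed in the
		 * first place.
		 */
		if (madvise(buf, len, MADV_NOHUGEPAGE)) {
			perror("madvise(MADV_NOHUGEPAGE)");
			return EXIT_FAILURE;
		}

		memset(buf, 0, len);	/* faults in 4K mappings */
		munmap(buf, len);
		return 0;
	}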