On Mon, 22 May 2023, Yang Shi wrote:
> On Sun, May 21, 2023 at 10:24 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >
> > __collapse_huge_page_swapin(): don't drop the map after every pte, it
> > only has to be dropped by do_swap_page(); give up if pte_offset_map()
> > fails; trace_mm_collapse_huge_page_swapin() at the end, with result;
> > fix comment on returned result; fix vmf.pgoff, though it's not used.
> >
> > collapse_huge_page(): use pte_offset_map_lock() on the _pmd returned
> > from clearing; allow failure, but it should be impossible there.
> > hpage_collapse_scan_pmd() and collapse_pte_mapped_thp() allow for
> > pte_offset_map_lock() failure.
> >
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
>
> Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>

Thanks.

>
> A nit below:
>
> > ---
> >  mm/khugepaged.c | 72 +++++++++++++++++++++++++++++++++----------------
> >  1 file changed, 49 insertions(+), 23 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 732f9ac393fc..49cfa7cdfe93 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
...
> > @@ -1029,24 +1040,29 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
> >  		 * resulting in later failure.
> >  		 */
> >  		if (ret & VM_FAULT_RETRY) {
> > -			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> >  			/* Likely, but not guaranteed, that page lock failed */
> > -			return SCAN_PAGE_LOCK;
> > +			result = SCAN_PAGE_LOCK;
>
> With per-VMA lock, this may not be true anymore, at least not true
> until per-VMA lock supports swap fault. It may be better to have a
> more general failure code, for example, SCAN_FAIL. But anyway you
> don't have to change it in your patch, I can send a follow-up patch
> once this series is landed on mm-unstable.

Interesting point (I've not tried to wrap my head around what
differences per-VMA locking would make to old likelihoods here),
and thank you for deferring a change on it - appreciated.

Something to beware of, if you do choose to change it: mostly those
SCAN codes (I'm not a fan of them!) are only for a tracepoint somewhere,
but madvise_collapse() and madvise_collapse_errno() take some of them
more seriously than others - I think SCAN_PAGE_LOCK ends up as an EAGAIN
(rightly), but SCAN_FAIL as an EINVAL (depends).

But maybe there are layers in between which do not propagate the result
code, I didn't check. All in all, not something I'd spend time on myself.

Hugh
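
To make the errno concern above concrete, here is a minimal standalone
sketch of the kind of result-to-errno mapping being described. It is not
the kernel's madvise_collapse_errno(): the demo_* names are hypothetical
stand-ins, and it covers only the two codes mentioned above, taking the
"I think" about EAGAIN and EINVAL as an assumption rather than a checked
fact.

```c
#include <errno.h>

/*
 * Hypothetical stand-ins for the kernel's SCAN_* result codes.
 * Only the codes discussed in this thread are modelled.
 */
enum demo_scan_result {
	DEMO_SCAN_FAIL,
	DEMO_SCAN_SUCCEED,
	DEMO_SCAN_PAGE_LOCK,
};

/*
 * Illustrative mapping only (assumed from the discussion, not copied
 * from mm/khugepaged.c): a page-lock failure is treated as transient,
 * anything unrecognised falls through to a generic error.
 */
int demo_collapse_errno(enum demo_scan_result r)
{
	switch (r) {
	case DEMO_SCAN_SUCCEED:
		return 0;
	case DEMO_SCAN_PAGE_LOCK:
		/* Transient: the caller may reasonably retry. */
		return -EAGAIN;
	default:
		/* Generic failure: looks like a bad request to the caller. */
		return -EINVAL;
	}
}
```

Under that assumption, replacing the page-lock code with the generic
failure code on the VM_FAULT_RETRY path would turn a retryable -EAGAIN
into -EINVAL for any caller that propagates the result.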