On Tue, May 23, 2023 at 9:44 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> On Mon, 22 May 2023, Yang Shi wrote:
> > On Sun, May 21, 2023 at 10:24 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > >
> > > __collapse_huge_page_swapin(): don't drop the map after every pte, it
> > > only has to be dropped by do_swap_page(); give up if pte_offset_map()
> > > fails; trace_mm_collapse_huge_page_swapin() at the end, with result;
> > > fix comment on returned result; fix vmf.pgoff, though it's not used.
> > >
> > > collapse_huge_page(): use pte_offset_map_lock() on the _pmd returned
> > > from clearing; allow failure, but it should be impossible there.
> > > hpage_collapse_scan_pmd() and collapse_pte_mapped_thp() allow for
> > > pte_offset_map_lock() failure.
> > >
> > > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> >
> > Reviewed-by: Yang Shi <shy828301@xxxxxxxxx>
>
> Thanks.
>
> >
> > A nit below:
> >
> > > ---
> > >  mm/khugepaged.c | 72 +++++++++++++++++++++++++++++++++----------------
> > >  1 file changed, 49 insertions(+), 23 deletions(-)
> > >
> > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > index 732f9ac393fc..49cfa7cdfe93 100644
> > > --- a/mm/khugepaged.c
> > > +++ b/mm/khugepaged.c
> ...
> > > @@ -1029,24 +1040,29 @@ static int __collapse_huge_page_swapin(struct mm_struct *mm,
> > >                  * resulting in later failure.
> > >                  */
> > >                 if (ret & VM_FAULT_RETRY) {
> > > -                       trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> > >                         /* Likely, but not guaranteed, that page lock failed */
> > > -                       return SCAN_PAGE_LOCK;
> > > +                       result = SCAN_PAGE_LOCK;
> >
> > With per-VMA lock, this may not be true anymore, at least not true
> > until per-VMA lock supports swap fault. It may be better to have a
> > more general failure code, for example, SCAN_FAIL. But anyway you
> > don't have to change it in your patch, I can send a follow-up patch
> > once this series is landed on mm-unstable.
>
> Interesting point (I've not tried to wrap my head around what differences
> per-VMA locking would make to old likelihoods here), and thank you for
> deferring a change on it - appreciated.
>
> Something to beware of, if you do choose to change it: mostly those
> SCAN codes (I'm not a fan of them!) are only for a tracepoint somewhere,
> but madvise_collapse() and madvise_collapse_errno() take some of them
> more seriously than others - I think SCAN_PAGE_LOCK ends up as an
> EAGAIN (rightly), but SCAN_FAIL as an EINVAL (depends).
>
> But maybe there are layers in between which do not propagate the result
> code, I didn't check. All in all, not something I'd spend time on myself.

Thanks, Hugh. A second look shows that do_swap_page() should not return
VM_FAULT_RETRY because of the per-VMA lock: that depends on the
FAULT_FLAG_VMA_LOCK flag, which is not set in the khugepaged path.
Khugepaged only has the FAULT_FLAG_ALLOW_RETRY flag set. So we don't have
to change anything.

>
> Hugh
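
For anyone skimming the thread, the two points above can be put as a small
user-space sketch. It is not kernel code: every SKETCH_/sketch_ name is
hypothetical, the flag values are arbitrary stand-ins rather than the
kernel's, and the errno mapping only covers the two codes discussed here,
following Hugh's "I think" description rather than a checked reading of
madvise_collapse_errno().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins: names echo the thread, values are NOT the kernel's. */
enum sketch_scan_result {
	SKETCH_SCAN_SUCCEED,
	SKETCH_SCAN_FAIL,
	SKETCH_SCAN_PAGE_LOCK,
};

#define SKETCH_FAULT_FLAG_ALLOW_RETRY	(1u << 0)
#define SKETCH_FAULT_FLAG_VMA_LOCK	(1u << 1)

/*
 * errno mapping as described in the thread (an assumption, not verified
 * against the source): page lock contention is worth retrying, while a
 * generic failure is reported as an invalid request.
 */
static int sketch_collapse_errno(enum sketch_scan_result r)
{
	switch (r) {
	case SKETCH_SCAN_SUCCEED:
		return 0;
	case SKETCH_SCAN_PAGE_LOCK:
		return -EAGAIN;
	default:
		return -EINVAL;
	}
}

/*
 * A retry caused by the per-VMA lock is only possible when the fault was
 * issued with the VMA-lock flag; ALLOW_RETRY alone does not imply it.
 */
static bool sketch_vma_lock_retry_possible(unsigned int fault_flags)
{
	return (fault_flags & SKETCH_FAULT_FLAG_VMA_LOCK) != 0;
}
```

With only SKETCH_FAULT_FLAG_ALLOW_RETRY set, as in the khugepaged path
described above, sketch_vma_lock_retry_possible() returns false, so the
"page lock failed" reading of VM_FAULT_RETRY stays plausible and the
retryable -EAGAIN result is kept.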