On Tue, Sep 22, 2020 at 12:11:29AM -0700, John Hubbard wrote: > On 9/21/20 2:17 PM, Peter Xu wrote: > > There's one special path for copy_one_pte() with swap entries, in which > > add_swap_count_continuation(GFP_ATOMIC) might fail. In that case we'll return > > I might be looking at the wrong place, but the existing code seems to call > add_swap_count_continuation(GFP_KERNEL), not with GFP_ATOMIC? Ah, I wanted to reference the one in swap_duplicate(). > > > the swp_entry_t so that the caller will release the locks and redo the same > > thing with GFP_KERNEL. > > > > It's confusing when copy_one_pte() must return a swp_entry_t (even if all the > > ptes are non-swap entries). More importantly, we face other requirement to > > extend this "we need to do something else, but without the locks held" case. > > > > Rework the return value into something easier to understand, as defined in enum > > copy_mm_ret. We'll pass the swp_entry_t back using the newly introduced union > > I like the documentation here, but it doesn't match what you did in the patch. > Actually, the documentation had the right idea (enum, rather than #define, for > COPY_MM_* items). Below... Yeah actually my very initial version has it as an enum, then I changed it to macros because I started to want it return negative as errors. However funnily in the current version copy_one_pte() won't return an error anymore... So probably, yes, it should be a good idea to get the enum back. Also we should be able to drop the negative handling too with copy_ret, though it should be in the next patch. > > > copy_mm_data parameter. > > > > Another trivial change is to move the reset of the "progress" counter into the > > retry path, so that we'll reset it for other reasons too. > > > > This should prepare us with adding new return codes, very soon. > > > > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> > > --- > > mm/memory.c | 42 +++++++++++++++++++++++++++++------------- > > 1 file changed, 29 insertions(+), 13 deletions(-) > > > > diff --git a/mm/memory.c b/mm/memory.c > > index 7525147908c4..1530bb1070f4 100644 > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -689,16 +689,24 @@ struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long addr, > > } > > #endif > > +#define COPY_MM_DONE 0 > > +#define COPY_MM_SWAP_CONT 1 > > Those should be enums, so as to get a little type safety and other goodness from > using non-macro items. > > ... > > @@ -866,13 +877,18 @@ static int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm, > > pte_unmap_unlock(orig_dst_pte, dst_ptl); > > cond_resched(); > > - if (entry.val) { > > - if (add_swap_count_continuation(entry, GFP_KERNEL) < 0) > > + switch (copy_ret) { > > + case COPY_MM_SWAP_CONT: > > + if (add_swap_count_continuation(data.entry, GFP_KERNEL) < 0) > > return -ENOMEM; > > - progress = 0; > > Yes. Definitely a little cleaner to reset this above, instead of here. > > > + break; > > + default: > > + break; > > I assume this no-op noise is to placate the compiler and/or static checkers. :) This is (so far) for COPY_MM_DONE. I normally will cover all cases in a "switch()" and here "default" is for it. Even if I covered all the possibilities, I may still tend to keep one "default" and a WARN_ON_ONCE(1) to make sure nothing I've missed. Not sure whether that's the ideal way, though. Thanks, -- Peter Xu