On Mon, Jun 29, 2020 at 4:55 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Mon 29-06-20 15:27:25, Joonsoo Kim wrote:
> [...]
> > A solution that introduces a new
> > argument doesn't cause this problem while avoiding CMA regions.
>
> My primary argument is that there is no real reason to treat hugetlb
> dequeuing somehow differently. So if we simply exclude __GFP_MOVABLE for
> _any_ other allocation then this certainly has some drawbacks on the
> usable memory for the migration target and it can lead to allocation
> failures (especially on movable_node setups where the amount of movable
> memory might be really high) and therefore longterm gup failures. And
> yes those failures might be premature. But my point is that the behavior
> would be _consistent_. So a user wouldn't see random failures for some
> types of pages while a success for others.

Hmm... I don't agree with your argument. Excluding __GFP_MOVABLE is a
*work-around* for excluding CMA regions. The dequeue implementation in
this patch is the right way to exclude CMA regions, so why should we
use a work-around for this case? Consistency is important, but it is
only meaningful if the behavior is correct, and it should not prevent
us from writing better code. Besides, dequeuing is already a special
path that exists only for hugetlb, so using a different (correct)
implementation there doesn't break any consistency.

> Let's have a look at this patch. It is simply working around the
> restriction for a very limited type of pages - only hugetlb pages
> which have reserves in non-cma movable pools. I would claim that many
> setups will simply not have many (if any) spare hugetlb pages in the
> pool except for temporary time periods when a workload is (re)starting
> because this would be effectively wasted memory.

This cannot be a reason to block the correct code.

> The patch is adding a special case flag to claim what the code already
> does by the memalloc_nocma_{save,restore} API so the information is
> already there. Sorry I didn't bring this up earlier but I had
> completely forgotten about its existence. With that one in place I do
> agree that dequeuing needs a fixup but that should be something like
> the following instead.

Thanks for letting me know. I wasn't aware of it until now. It does
look better to use this API than to introduce a new argument.

> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 57ece74e3aae..c1595b1d36f3 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1092,10 +1092,14 @@ static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask,
>  /* Movability of hugepages depends on migration support. */
>  static inline gfp_t htlb_alloc_mask(struct hstate *h)
>  {
> +	gfp_t gfp;
> +
>  	if (hugepage_movable_supported(h))
> -		return GFP_HIGHUSER_MOVABLE;
> +		gfp = GFP_HIGHUSER_MOVABLE;
>  	else
> -		return GFP_HIGHUSER;
> +		gfp = GFP_HIGHUSER;
> +
> +	return current_gfp_context(gfp);
>  }
>
>  static struct page *dequeue_huge_page_vma(struct hstate *h,
>
> If we even fix this general issue for other allocations and allow a
> better CMA exclusion then it would be implemented consistently for
> everybody.

Yes, I have reviewed the memalloc_nocma_{save,restore} API and found
a better way to exclude CMA regions. I will do it after this patch is
finished.

> Does this make more sense to you or are we still not on the same page
> wrt the actual problem?

Yes, but we still have different opinions about it. As said above, I
will make a patch for better CMA exclusion after this patchset, which
will make the code consistent.
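For reference, the direction I have in mind is simply to wrap the
migration target allocation in the scoped API. A rough, untested
sketch follows; new_page() and its arguments are placeholders for
whatever callback the real caller uses, not actual code:

	#include <linux/sched/mm.h>

	static struct page *alloc_target_nocma(struct page *old_page,
					       unsigned long private)
	{
		unsigned int flags;
		struct page *page;

		/*
		 * PF_MEMALLOC_NOCMA makes current_gfp_context() strip
		 * __GFP_MOVABLE, so allocations in this scope that honor
		 * the scoped gfp context - including the hugetlb dequeue
		 * path with your fixup above - avoid CMA regions.
		 */
		flags = memalloc_nocma_save();
		page = new_page(old_page, private);	/* placeholder */
		memalloc_nocma_restore(flags);

		return page;
	}

With something like this in each longterm-pinning caller, the CMA
exclusion would apply to every allocation type in one place instead
of being special-cased per page type.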
I'd really appreciate it if you could wait until then. Thanks.