On Thu, Mar 04, 2021 at 10:11:35AM -0800, Minchan Kim wrote: > On Thu, Mar 04, 2021 at 06:23:09PM +0100, David Hildenbrand wrote: > > > > You want to debug something, so you try triggering it and capturing debug > > > > data. There are not that many alloc_contig_range() users such that this > > > > would really be an issue to isolate ... > > > > > > cma_alloc uses alloc_contig_range and cma_alloc has lots of users. > > > Even, it is expoerted by dmabuf so any userspace would trigger the > > > allocation by their own. Some of them could be tolerant for the failure, > > > rest of them could be critical. We should't expect it by limited kernel > > > usecase. > > > > Assume you are debugging allocation failures. You either collect the data > > yourself or ask someone to send you that output. You care about any > > alloc_contig_range() allocation failures that shouldn't happen, don't you? > > > > > > > > > > > > > Strictly speaking: any allocation failure on ZONE_MOVABLE or CMA is > > > > problematic (putting aside NORETRY logic and similar aside). So any such > > > > page you hit is worth investigating and, therefore, worth getting logged for > > > > debugging purposes. > > > > > > If you believe the every alloc_contig_range failure is problematic > > > > Every one where we should have guarantees I guess: ZONE_MOVABLE or > > MIGRAT_CMA. On ZONE_NORMAL, there are no guarantees. > > Indeed. How about this? diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 238d0fc232aa..489e557b9390 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -8481,7 +8481,8 @@ static inline void dump_migrate_failure_pages(struct list_head *page_list) /* [start, end) must belong to a single zone. */ static int __alloc_contig_migrate_range(struct compact_control *cc, - unsigned long start, unsigned long end) + unsigned long start, unsigned long end, + bool nofail) { /* This function is based on compact_zone() from compaction.c. */ unsigned int nr_reclaimed; @@ -8522,7 +8523,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc, NULL, (unsigned long)&mtc, cc->mode, MR_CONTIG_RANGE); } if (ret < 0) { - dump_migrate_failure_pages(&cc->migratepages); + if (ret == -EBUSY && nofail) + dump_migrate_failure_pages(&cc->migratepages); putback_movable_pages(&cc->migratepages); return ret; } @@ -8610,7 +8612,9 @@ int alloc_contig_range(unsigned long start, unsigned long end, * allocated. So, if we fall through be sure to clear ret so that * -EBUSY is not accidentally used or returned to caller. */ - ret = __alloc_contig_migrate_range(&cc, start, end); + ret = __alloc_contig_migrate_range(&cc, start, end, + migratetype == CMA || + zone_idx(cc.zone) == ZONE_MOVABLE); if (ret && ret != -EBUSY) goto done; ret =0;