On Tue, Mar 09, 2021 at 10:32:51AM +0100, Michal Hocko wrote: > On Mon 08-03-21 12:20:47, Minchan Kim wrote: > > alloc_contig_range is usually used on cma area or movable zone. > > It's critical if the page migration fails on those areas so > > dump more debugging message. > > I disagree with this statement. alloc_contig_range is not a reliable > allocator. Any user, be it CMA or direct users of alloc_contig_range > have to deal with allocation failures. Debugging information can be > still useful but considering migration failures critical is > overstatement to say the least. Fair enough. Let's change it. "Currently, debugging CMA allocation failure is too hard due to lacking of page information. alloc_contig_range is proper place to dump them since it has migrate-failed page list." > > > page refcount, mapcount with page flags on dump_page are > > helpful information to deduce the culprit. Furthermore, > > dump_page_owner was super helpful to find long term pinner > > who initiated the page allocation. > > > > Admin could enable the dump like this(by default, disabled) > > > > echo "func dump_migrate_failure_pages +p" > control > > > > Admin could disable it. > > > > echo "func dump_migrate_failure_pages =_" > control > > My original idea was to add few pr_debug and -DDYNAMIC_DEBUG_MODULE for > page_alloc.c. It makes sense to enable a whole bunch at once though. > The naming should better reflect this is alloc_contig_rage related > because the above sounds like a generic migration failure thing. alloc_contig_dump_pages? > > Somebody more familiar with the dynamic debugging infrastructure needs > to have a look but from from a quick look it seems ok. > > Do we really need all the ugly ifdefery, though? Don't we want to have > this compiled in all the time and just rely on the static branch managed > by the dynamic debugging framework? I have no further idea to make it simple while we keep the flexibility for arguments and print format. #if defined(CONFIG_DYNAMIC_DEBUG) || \ (defined(CONFIG_DYNAMIC_DEBUG_CORE) && defined(DYNAMIC_DEBUG_MODULE)) static void alloc_contig_dump_pages(struct list_head *page_list) { static DEFINE_RATELIMIT_STATE(_rs, DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST); DEFINE_DYNAMIC_DEBUG_METADATA(descriptor, "migrate failure"); if (DYNAMIC_DEBUG_BRANCH(descriptor) && __ratelimit(&_rs)) { struct page *page; WARN(1, "failed callstack"); list_for_each_entry(page, page_list, lru) dump_page(page, "migration failure"); } } #else static inline void alloc_contig_dump_pages(struct list_head *page_list) { } #endif > > [...] > > +void dump_migrate_failure_pages(struct list_head *page_list) > > +{ > > + DEFINE_DYNAMIC_DEBUG_METADATA(descriptor, > > + "migrate failure"); > > + if (DYNAMIC_DEBUG_BRANCH(descriptor) && > > + alloc_contig_ratelimit()) { > > + struct page *page; > > + > > + WARN(1, "failed callstack"); > > + list_for_each_entry(page, page_list, lru) > > + dump_page(page, "migration failure"); > > + } > > Apart from the above, do we have to warn for something that is a > debugging aid? A similar concern wrt dump_page which uses pr_warn and Make sense. > page owner is using even pr_alert. > Would it make sense to add a loglevel parameter both into __dump_page > and dump_page_owner? Let me try it.