On Mon, 21 Aug 2017 23:08:18 +0800 Chen Yu <yu.c.chen@xxxxxxxxx> wrote:

> There is a problem that when counting the pages for creating
> the hibernation snapshot will take significant amount of
> time, especially on system with large memory. Since the counting
> job is performed with irq disabled, this might lead to NMI lockup.
> The following warning were found on a system with 1.5TB DRAM:
>
> ...
>
> It has taken nearly 20 seconds(2.10GHz CPU) thus the NMI lockup
> was triggered. In case the timeout of the NMI watch dog has been
> set to 1 second, a safe interval should be 6590003/20 = 320k pages
> in theory. However there might also be some platforms running at a
> lower frequency, so feed the watchdog every 100k pages.
>
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2531,9 +2532,12 @@ void drain_all_pages(struct zone *zone)
>
>  #ifdef CONFIG_HIBERNATION
>
> +/* Touch watchdog for every WD_INTERVAL_PAGE pages. */
> +#define WD_INTERVAL_PAGE (100*1024)
> +
>  void mark_free_pages(struct zone *zone)
>  {
> -        unsigned long pfn, max_zone_pfn;
> +        unsigned long pfn, max_zone_pfn, page_num = 0;
>          unsigned long flags;
>          unsigned int order, t;
>          struct page *page;
> @@ -2548,6 +2552,9 @@ void mark_free_pages(struct zone *zone)
>                  if (pfn_valid(pfn)) {
>                          page = pfn_to_page(pfn);
>
> +                        if (!((page_num++) % WD_INTERVAL_PAGE))
> +                                touch_nmi_watchdog();
> +
>                          if (page_zone(page) != zone)
>                                  continue;
>
> @@ -2561,8 +2568,11 @@ void mark_free_pages(struct zone *zone)
>                          unsigned long i;
>
>                          pfn = page_to_pfn(page);
> -                        for (i = 0; i < (1UL << order); i++)
> +                        for (i = 0; i < (1UL << order); i++) {
> +                                if (!((page_num++) % WD_INTERVAL_PAGE))
> +                                        touch_nmi_watchdog();
>                                  swsusp_set_page_free(pfn_to_page(pfn + i));
> +                        }
>                  }
>          }
>          spin_unlock_irqrestore(&zone->lock, flags);

hm, is it really worth all the WD_INTERVAL_PAGE stuff? touch_nmi_watchdog() is pretty efficient and calling it once-per-page may not have a measurable effect.
And if we're really concerned about the performance impact it would be better to make WD_INTERVAL_PAGE a power of 2 (128*1024?) to avoid the modulus operation.
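
For illustration only, here is a rough userspace sketch of what the power-of-2 variant could look like. It is not part of the patch: wd_should_touch(), scan_pages() and touch_nmi_watchdog_stub() are hypothetical names, the stub merely counts calls in place of the real touch_nmi_watchdog(), and 128*1024 is just the value floated above. With a power-of-2 interval the check becomes a mask test, which a compiler would likely generate from `%` on an unsigned constant anyway.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed interval: a power of two (128*1024 == 1UL << 17). */
#define WD_INTERVAL_PAGE (128UL * 1024)

/* The mask test below is only valid for a power-of-two interval. */
static_assert((WD_INTERVAL_PAGE & (WD_INTERVAL_PAGE - 1)) == 0,
              "WD_INTERVAL_PAGE must be a power of two");

/* Userspace stand-in for touch_nmi_watchdog(); it only counts calls. */
static unsigned long wd_touches;

static void touch_nmi_watchdog_stub(void)
{
        wd_touches++;
}

/* True once every WD_INTERVAL_PAGE pages, using a mask instead of '%'. */
static bool wd_should_touch(unsigned long page_num)
{
        return (page_num & (WD_INTERVAL_PAGE - 1)) == 0;
}

/* Walk nr_pages pages and return how often the watchdog was touched. */
unsigned long scan_pages(unsigned long nr_pages)
{
        unsigned long page_num;

        wd_touches = 0;
        for (page_num = 0; page_num < nr_pages; page_num++) {
                if (wd_should_touch(page_num))
                        touch_nmi_watchdog_stub();
        }
        return wd_touches;
}
```

Scanning 3 * WD_INTERVAL_PAGE pages touches the stub three times (at page 0, 131072 and 262144), the same cadence the modulus form would give with this interval.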