>> >> reclaim of the task in do_try_to_free_pages(). In systems with NUMA
>> >> open, some tasks occasionally experience slower response times, but the
>> >> total count of reclaim does not increase, using ftrace can show that
>> >> node_reclaim has occurred.
>> >>
>> >> The memory reclaim occurring in get_page_from_freelist() is also due to
>> >> heavy memory load. To get the impact of tasks in memory reclaim, this
>> >> patch adds the statistics of the memory reclaim delay statistics for
>> >> __node_reclaim().
>> >>
>> >> ...
>> >>
>> >> --- a/mm/vmscan.c
>> >> +++ b/mm/vmscan.c
>> >> @@ -8010,6 +8010,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>> >>
>> >>  	cond_resched();
>> >>  	psi_memstall_enter(&pflags);
>> >> +	delayacct_freepages_start();
>> >>  	fs_reclaim_acquire(sc.gfp_mask);
>> >>  	/*
>> >>  	 * We need to be able to allocate from the reserves for RECLAIM_UNMAP
>> >> @@ -8032,6 +8033,7 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>> >>  	memalloc_noreclaim_restore(noreclaim_flag);
>> >>  	fs_reclaim_release(sc.gfp_mask);
>> >>  	psi_memstall_leave(&pflags);
>> >> +	delayacct_freepages_end();
>> >>
>> >>  	trace_mm_vmscan_node_reclaim_end(sc.nr_reclaimed);
>> >
>> > __node_reclaim() calls shrink_node() which at some point will call
>> > do_try_to_free_pages() (yes?), which calls delayacct_freepages_start().
>> >
>> > So we're effectively nesting calls to delayacct_freepages_start(),
>> > which isn't designed for that?
>> >
>> sorry, the last reply was a mistake.
>>
>> It seems that no point in shrink_node() will call do_try_to_free_pages().
>> And do_try_to_free_pages() will call shrink_node() through shrink_zones(),
>> if shrink_node() also has some point will call do_try_to_free_pages,then
>> delayacct_freepages_start() is nested now?
>
> That's because shrink_node() goes through shrink_list() via
> shrink_lruvec()? do_try_to_free_pages() will call shrink_node().
> Ideally we should have some counters around __node_reclaim() and
> balance_pgdat() like psi_memstall_* does. Do you want to mimic what
> psi_memstall_* does? This would change the definition of delayacct
> free pages, but I don't think it will make it worse.
>
> Balbir Singh

The focus of delayacct should be the per-task memory reclaim delay
statistics, and it should have only a few direct connections with
shrink_node(). At least the current use of delayacct_freepages_start()
does not seem to be wrong, so I don't think it is necessary to implement
a new counting method.

Compared with also accounting the delay of balance_pgdat() for kswapd,
isn't it more meaningful to keep the definition of delayacct free pages
and only account for applications?

Keeping the definition of delayacct free pages and going back to this
simple patch: it only does one very simple thing, which is counting the
memory reclaim delay caused by memory pressure on the memory allocation
path of the application. Currently the memory reclaim delay is only
measured in do_try_to_free_pages(); this patch adds accounting points in
__node_reclaim(). Both do_try_to_free_pages() and __node_reclaim() call
shrink_node().

WenYu