On Tue, 23 Mar 2021 10:44:21 +0000
Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, Mar 22, 2021 at 09:18:42AM +0000, Mel Gorman wrote:
> > This series is based on top of Matthew Wilcox's series "Rationalise
> > __alloc_pages wrapper" and does not apply to 5.12-rc2. If you want to
> > test and are not using Andrew's tree as a baseline, I suggest using the
> > following git tree
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v5r9
> >
>
> Jesper and Chuck, would you mind rebasing on top of the following branch
> please?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v6r2
>
> The interface is the same so the rebase should be trivial.
>
> Jesper, I'm hoping you see no differences in performance but it's best
> to check.

I will rebase and check again.

In the performance tests I am currently running, I observe that the
compiler lays out the code in unfortunate ways, which causes I-cache
performance issues. I wonder if you could integrate the patch below
with your patchset? (just squash it)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

[PATCH] mm: optimize code layout for __alloc_pages_bulk

From: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>

Looking at perf-report and the ASM code for __alloc_pages_bulk(), the
layout of the code that actually runs is suboptimal. The compiler
guesses wrong and places unlikely code at the beginning.
Due to the use of the WARN_ON_ONCE() macro, the UD2 asm instruction is
added to the code, which confuses the I-cache prefetcher in the CPU.

Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
---
 mm/page_alloc.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f60f51a97a7b..88a5c1ce5b87 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5003,10 +5003,10 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	unsigned int alloc_flags;
 	int nr_populated = 0, prep_index = 0;
 
-	if (WARN_ON_ONCE(nr_pages <= 0))
+	if (unlikely(nr_pages <= 0))
 		return 0;
 
-	if (WARN_ON_ONCE(page_list && !list_empty(page_list)))
+	if (unlikely(page_list && !list_empty(page_list)))
 		return 0;
 
 	/* Skip populated array elements. */
@@ -5018,7 +5018,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		prep_index = nr_populated;
 	}
 
-	if (nr_pages == 1)
+	if (unlikely(nr_pages == 1))
 		goto failed;
 
 	/* May set ALLOC_NOFRAGMENT, fragmentation will return 1 page. */
@@ -5054,7 +5054,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	 * If there are no allowed local zones that meets the watermarks then
 	 * try to allocate a single page and reclaim if necessary.
 	 */
-	if (!zone)
+	if (unlikely(!zone))
 		goto failed;
 
 	/* Attempt the batch allocation */
@@ -5075,7 +5075,7 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 		page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags,
 								pcp, pcp_list);
-		if (!page) {
+		if (unlikely(!page)) {
 			/* Try and get at least one page */
 			if (!nr_populated)
 				goto failed_irq;
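
For anyone who wants to experiment with the branch hints outside the
kernel, here is a minimal user-space sketch, assuming GCC or Clang. The
likely()/unlikely() macros follow the usual kernel definition on top of
__builtin_expect(); bulk_fill() and its parameters are hypothetical
names made up for illustration and are not the allocator code. It only
shows the pattern from the patch: rare exits are marked unlikely() so
the compiler can keep the common path contiguous.

```c
#include <stddef.h>

/* User-space stand-ins for the kernel branch hints (GCC/Clang builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper, not kernel code: walk 'slots' and fill every
 * empty entry with the next pointer from 'pool'. Entries that are
 * already populated are skipped but still counted.
 *
 * Returns the number of populated slots seen before the walk stopped,
 * either at the end of the array or when the pool ran dry.
 */
int bulk_fill(int nr, void **slots, void **pool, int pool_len)
{
	int i, populated = 0, used = 0;

	/* Bad arguments are rare: hint them away from the hot path. */
	if (unlikely(nr <= 0 || !slots))
		return 0;

	for (i = 0; i < nr; i++) {
		/* Skip populated array elements. */
		if (slots[i]) {
			populated++;
			continue;
		}

		/* Running out of pool entries is the cold path. */
		if (unlikely(used >= pool_len))
			break;

		slots[i] = pool[used++];
		populated++;
	}

	return populated;
}
```

Compiling with "gcc -O2 -S" with and without the unlikely() hints and
comparing the generated assembly is a quick way to see whether the
block order changes on a given compiler.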