On Thu, Aug 10, 2023 at 06:38:29PM +0800, Guo Hui wrote:
> In the function __nr_to_section,
> Use shift operation instead of division operation
> in order to improve the performance of memory management.
> There are no functional changes.
>
> Some performance data is as follows:
> Machine configuration: Hygon 128 cores, 256M memory
>
> Stream single core:
>            with patch    without patch    promote
> Copy       23376.7731    23907.1532       -1.27%
> Scale      12580.2913    11679.7852       +7.71%
> Add        11922.9562    11461.8669       +4.02%
> Triad      12549.2735    11491.9798       +9.20%

How stable are these numbers?  Because this patch makes no sense to me.

#define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT)

with:

#ifdef CONFIG_SPARSEMEM_EXTREME
#define SECTIONS_PER_ROOT (PAGE_SIZE / sizeof (struct mem_section))
#else
#define SECTIONS_PER_ROOT 1
#endif

sizeof(struct mem_section) is a constant power-of-two.  So if this
result is real, then GCC isn't able to turn a
divide-by-a-constant-power-of-two into a shift.  That seems _really_
unlikely to me.  And if that is what's going on, then that needs to be
fixed!  Can you examine some before-and-after assembly dumps to see if
that is what's going on?

> Signed-off-by: Guo Hui <guohui@xxxxxxxxxxxxx>
> ---
>  include/linux/mmzone.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 5e50b78d58ea..8dde6fb56109 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1818,7 +1818,8 @@ struct mem_section {
>  #define SECTIONS_PER_ROOT 1
>  #endif
>
> -#define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT)
> +#define SECTION_ROOT_SHIFT (__builtin_popcount(SECTIONS_PER_ROOT - 1))
> +#define SECTION_NR_TO_ROOT(sec) ((sec) >> SECTION_ROOT_SHIFT)
>  #define NR_SECTION_ROOTS DIV_ROUND_UP(NR_MEM_SECTIONS, SECTIONS_PER_ROOT)
>  #define SECTION_ROOT_MASK (SECTIONS_PER_ROOT - 1)
>
> --
> 2.20.1
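
For anyone who wants to try the before-and-after comparison outside the
kernel tree, here is a minimal userspace sketch of the two forms.  It is
not the kernel code: DEMO_PAGE_SIZE, struct demo_mem_section and every
demo_* name are hypothetical stand-ins, and the 4096-byte page and
two-word struct are assumptions chosen only so that the divisor is a
constant power of two, as in the CONFIG_SPARSEMEM_EXTREME case.  It
assumes GCC or Clang for __builtin_popcount.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins, NOT the kernel definitions. */
#define DEMO_PAGE_SIZE 4096UL
struct demo_mem_section {
	unsigned long section_mem_map;
	unsigned long usage;
};

/* Same shape as SECTIONS_PER_ROOT under CONFIG_SPARSEMEM_EXTREME. */
#define DEMO_SECTIONS_PER_ROOT \
	(DEMO_PAGE_SIZE / sizeof(struct demo_mem_section))

/* Same shape as the patch: popcount(N - 1) equals log2(N) only when
 * N is a power of two. */
#define DEMO_ROOT_SHIFT (__builtin_popcount(DEMO_SECTIONS_PER_ROOT - 1))

/* Form before the patch: divide by a compile-time constant. */
unsigned long demo_root_div(unsigned long sec)
{
	return sec / DEMO_SECTIONS_PER_ROOT;
}

/* Form after the patch: explicit right shift. */
unsigned long demo_root_shift(unsigned long sec)
{
	return sec >> DEMO_ROOT_SHIFT;
}

/* Count the inputs in [0, limit) for which the two forms disagree. */
unsigned long demo_count_mismatches(unsigned long limit)
{
	unsigned long sec, bad = 0;

	for (sec = 0; sec < limit; sec++)
		if (demo_root_div(sec) != demo_root_shift(sec))
			bad++;
	return bad;
}
```

Building this with something like "gcc -O2 -S" and comparing the bodies
of demo_root_div() and demo_root_shift() is one way to see whether the
compiler emits different instructions for the two forms.  The operand
is unsigned here on purpose; a signed operand would need extra rounding
fixups and would not be a fair comparison.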