On 16 Mar 2023, at 19:21, Kirill A. Shutemov wrote:

> On Thu, Mar 16, 2023 at 01:09:30PM -0400, Zi Yan wrote:
>>> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
>>> index 86fd88492870..c267b8c61e97 100644
>>> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
>>> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
>>> @@ -172,7 +172,7 @@ variables.
>>>  Offset of the free_list's member. This value is used to compute the number
>>>  of free pages.
>>>
>>> -Each zone has a free_area structure array called free_area[MAX_ORDER].
>>> +Each zone has a free_area structure array called free_area[MAX_ORDER + 1].
>>>  The free_list represents a linked list of free page blocks.
>>>
>>>  (list_head, next|prev)
>>
>> In vmcoreinfo.rst, line 192:
>>
>> - (zone.free_area, MAX_ORDER)
>> + (zone.free_area, MAX_ORDER + 1)
>
> Okay.
>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index 6221a1d057dd..50da4f26fad5 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -3969,7 +3969,7 @@
>>>  			[KNL] Minimal page reporting order
>>>  			Format: <integer>
>>>  			Adjust the minimal page reporting order. The page
>>> -			reporting is disabled when it exceeds (MAX_ORDER-1).
>>> +			reporting is disabled when it exceeds MAX_ORDER.
>>>
>>>  	panic=		[KNL] Kernel behaviour on panic: delay <timeout>
>>>  			timeout > 0: seconds before rebooting
>>
>> line 942:
>> -			possible value is MAX_ORDER/2. Setting this parameter
>> +			possible value is (MAX_ORDER + 1)/2. Setting this parameter
>>
>
> I don't think it worth it. See below, on the relevant code change.
>
>>> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
>>> index d6bbdb7830b2..273a0fe7910a 100644
>>> --- a/kernel/events/ring_buffer.c
>>> +++ b/kernel/events/ring_buffer.c
>>> @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order)
>>>  {
>>>  	struct page *page;
>>>
>>> -	if (order >= MAX_ORDER)
>>> -		order = MAX_ORDER - 1;
>>> +	if (order > MAX_ORDER)
>>> +		order = MAX_ORDER;
>>>
>>>  	do {
>>>  		page = alloc_pages_node(node, PERF_AUX_GFP, order);
>>
>> line 817:
>>
>> -	if (order_base_2(size) >= PAGE_SHIFT+MAX_ORDER)
>> +	if (order_base_2(size) > PAGE_SHIFT+MAX_ORDER)
>
> Right.
>
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index 4751031f3f05..fc059969d7ba 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -346,9 +346,9 @@ config SHUFFLE_PAGE_ALLOCATOR
>>>  	  the presence of a memory-side-cache. There are also incidental
>>>  	  security benefits as it reduces the predictability of page
>>>  	  allocations to compliment SLAB_FREELIST_RANDOM, but the
>>> -	  default granularity of shuffling on the "MAX_ORDER - 1" i.e,
>>> -	  10th order of pages is selected based on cache utilization
>>> -	  benefits on x86.
>>> +	  default granularity of shuffling on the MAX_ORDER i.e, 10th
>>> +	  order of pages is selected based on cache utilization benefits
>>> +	  on x86.
>>>
>>>  	  While the randomization improves cache utilization it may
>>>  	  negatively impact workloads on platforms without a cache. For
>>
>> line 669:
>>
>> -	  Note that the pageblock_order cannot exceed MAX_ORDER - 1 and will be
>> -	  clamped down to MAX_ORDER - 1.
>> +	  Note that the pageblock_order cannot exceed MAX_ORDER and will be
>> +	  clamped down to MAX_ORDER.
>>
>
> Okay. Missed that.
>
>>> diff --git a/mm/kmsan/init.c b/mm/kmsan/init.c
>>> index 7fb794242fad..ffedf4dbc49d 100644
>>> --- a/mm/kmsan/init.c
>>> +++ b/mm/kmsan/init.c
>>> @@ -96,7 +96,7 @@ void __init kmsan_init_shadow(void)
>>>  struct metadata_page_pair {
>>>  	struct page *shadow, *origin;
>>>  };
>>> -static struct metadata_page_pair held_back[MAX_ORDER] __initdata;
>>> +static struct metadata_page_pair held_back[MAX_ORDER + 1] __initdata;
>>>
>>>  /*
>>>   * Eager metadata allocation. When the memblock allocator is freeing pages to
>>
>> line 144: this one I am not sure if the original code is wrong or not.
>>
>> -	.order = MAX_ORDER,
>> +	.order = MAX_ORDER + 1,
>
> I think the original code is wrong, but the initialization seems unused:
> it got overridden in kmsan_memblock_discard() before the first use.
>
>>> @@ -211,8 +211,8 @@ static void kmsan_memblock_discard(void)
>>>  	 * order=N-1,
>>>  	 * - repeat.
>>>  	 */
>>> -	collect.order = MAX_ORDER - 1;
>>> -	for (int i = MAX_ORDER - 1; i >= 0; i--) {
>>> +	collect.order = MAX_ORDER;
>>> +	for (int i = MAX_ORDER; i >= 0; i--) {
>>>  		if (held_back[i].shadow)
>>>  			smallstack_push(&collect, held_back[i].shadow);
>>>  		if (held_back[i].origin)
>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>> index 25fd0626a9e7..338b8cb0793e 100644
>>> --- a/mm/memblock.c
>>> +++ b/mm/memblock.c
>>> @@ -2043,7 +2043,7 @@ static void __init __free_pages_memory(unsigned long start, unsigned long end)
>>>  	int order;
>>>
>>>  	while (start < end) {
>>> -		order = min(MAX_ORDER - 1UL, __ffs(start));
>>> +		order = min(MAX_ORDER, __ffs(start));
>>
>> while you are here, maybe using min_t is better.
>>
>> order = min_t(unsigned long, MAX_ORDER, __ffs(start));
>
> Already addressed by fixup.
>
>>>
>>>  		while (start + (1UL << order) > end)
>>>  			order--;
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index db3b270254f1..86291c79a764 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -596,7 +596,7 @@ static void online_pages_range(unsigned long start_pfn, unsigned long nr_pages)
>>>  	unsigned long pfn;
>>>
>>>  	/*
>>> -	 * Online the pages in MAX_ORDER - 1 aligned chunks. The callback might
>>> +	 * Online the pages in MAX_ORDER aligned chunks. The callback might
>>>  	 * decide to not expose all pages to the buddy (e.g., expose them
>>>  	 * later). We account all pages as being online and belonging to this
>>>  	 * zone ("present").
>>> @@ -605,7 +605,7 @@ static void online_pages_range(unsigned long start_pfn, unsigned long nr_pages)
>>>  	 * this and the first chunk to online will be pageblock_nr_pages.
>>>  	 */
>>>  	for (pfn = start_pfn; pfn < end_pfn;) {
>>> -		int order = min(MAX_ORDER - 1UL, __ffs(pfn));
>>> +		int order = min(MAX_ORDER, __ffs(pfn));
>>
>> ditto
>>
>> int order = min_t(unsigned long, MAX_ORDER, __ffs(pfn));
>
> Ditto.
>
>>>
>>>  		(*online_page_callback)(pfn_to_page(pfn), order);
>>>  		pfn += (1UL << order);
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index ac1fc986af44..66700f27b4c6 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>
>> line 842: it might make a difference when MAX_ORDER is odd.
>>
>> -	if (kstrtoul(buf, 10, &res) < 0 || res > MAX_ORDER / 2) {
>> +	if (kstrtoul(buf, 10, &res) < 0 || res > (MAX_ORDER + 1) / 2) {
>
> I don't think it worth the complication: the upper limit here is pretty
> arbitrary and +1 doesn't really make a difference. I would rather keep it
> simple.
>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index 32eb6b50fe18..0e19c0d647e6 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4171,8 +4171,8 @@ static inline int calculate_order(unsigned int size)
>>>  	/*
>>>  	 * Doh this slab cannot be placed using slub_max_order.
>>>  	 */
>>> -	order = calc_slab_order(size, 1, MAX_ORDER - 1, 1);
>>> -	if (order < MAX_ORDER)
>>> +	order = calc_slab_order(size, 1, MAX_ORDER, 1);
>>> +	if (order <= MAX_ORDER)
>>>  		return order;
>>>  	return -ENOSYS;
>>>  }
>>> @@ -4697,7 +4697,7 @@ __setup("slub_min_order=", setup_slub_min_order);
>>>  static int __init setup_slub_max_order(char *str)
>>>  {
>>>  	get_option(&str, (int *)&slub_max_order);
>>> -	slub_max_order = min(slub_max_order, (unsigned int)MAX_ORDER - 1);
>>> +	slub_max_order = min(slub_max_order, (unsigned int)MAX_ORDER);
>>
>> maybe min_t is better?
>>
>> slub_max_order = min_t(unsigned int, slub_max_order, MAX_ORDER);
>
> Fair enough.
>
> ...
>
>> The changes look good to me. I added some missing changes inline, although the line
>> number might not be exact. Feel free to add Reviewed-by: Zi Yan <ziy@xxxxxxxxxx>.
>>
>> Do you think it is worth adding a MAX_ORDER check in checkpatch.pl to warn people
>> the meaning of MAX_ORDER has changed? Something like:
>>
>> # check for MAX_ORDER uses as its semantics has changed.
>> # MAX_ORDER now really means the max order of a page that can come out of
>> # kernel buddy allocator
>> 		if ($line =~ /MAX_ORDER/) {
>> 			WARN("MAX_ORDER",
>> 			     "MAX_ORDER has changed its semantics. The max order of a page that can be allocated from buddy allocator is MAX_ORDER instead of MAX_ORDER - 1.")
>> 		}
>>
>
> We can add, if you think it is helpful. I don't feel strongly about this.
>
> Below is fixup I made based on your feedback:
>
> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> index c267b8c61e97..e488bb4e13c4 100644
> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> @@ -189,7 +189,7 @@ Offsets of the vmap_area's members. They carry vmalloc-specific
>  information. Makedumpfile gets the start address of the vmalloc region
>  from this.
>
> -(zone.free_area, MAX_ORDER)
> +(zone.free_area, MAX_ORDER + 1)
>  ---------------------------
>
>  Free areas descriptor. User-space tools use this value to iterate the
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 273a0fe7910a..a0433f37b024 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -814,7 +814,7 @@ struct perf_buffer *rb_alloc(int nr_pages, long watermark, int cpu, int flags)
>  	size = sizeof(struct perf_buffer);
>  	size += nr_pages * sizeof(void *);
>
> -	if (order_base_2(size) >= PAGE_SHIFT+MAX_ORDER)
> +	if (order_base_2(size) > PAGE_SHIFT+MAX_ORDER)
>  		goto fail;
>
>  	node = (cpu == -1) ? cpu : cpu_to_node(cpu);
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 467844de48e5..6ee3b48ed298 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -666,8 +666,8 @@ config HUGETLB_PAGE_SIZE_VARIABLE
>  	  HUGETLB_PAGE_ORDER when there are multiple HugeTLB page sizes available
>  	  on a platform.
>
> -	  Note that the pageblock_order cannot exceed MAX_ORDER - 1 and will be
> -	  clamped down to MAX_ORDER - 1.
> +	  Note that the pageblock_order cannot exceed MAX_ORDER and will be
> +	  clamped down to MAX_ORDER.
>
>  config CONTIG_ALLOC
>  	def_bool (MEMORY_ISOLATION && COMPACTION) || CMA
> diff --git a/mm/slub.c b/mm/slub.c
> index 0e19c0d647e6..f49d669ff604 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4697,7 +4697,7 @@ __setup("slub_min_order=", setup_slub_min_order);
>  static int __init setup_slub_max_order(char *str)
>  {
>  	get_option(&str, (int *)&slub_max_order);
> -	slub_max_order = min(slub_max_order, (unsigned int)MAX_ORDER);
> +	slub_max_order = min_t(unsigned int, slub_max_order, MAX_ORDER);
>
>  	return 1;
>  }
> --
>   Kiryl Shutsemau / Kirill A. Shutemov

LGTM. Thanks.

Reviewed-by: Zi Yan <ziy@xxxxxxxxxx>

--
Best Regards,
Yan, Zi
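
For readers following the thread, the inclusive MAX_ORDER semantics discussed
above can be sketched in plain user-space C. This is only an illustration under
stated assumptions, not kernel code: DEMO_MAX_ORDER, struct demo_area,
demo_free_area, demo_nr_orders(), demo_clamp_order() and demo_chunk_order() are
hypothetical names, the value 10 is just an example, and the bit loop merely
stands in for what the quoted patch does with __ffs().

```c
#include <stddef.h>

/* Hypothetical example value; the real MAX_ORDER depends on the kernel config. */
#define DEMO_MAX_ORDER 10

struct demo_area {
	unsigned long nr_free;
};

/*
 * With the inclusive meaning, orders 0..DEMO_MAX_ORDER are all valid,
 * so a per-order array needs DEMO_MAX_ORDER + 1 slots.
 */
static struct demo_area demo_free_area[DEMO_MAX_ORDER + 1];

/* Number of valid orders, derived from the array size. */
static size_t demo_nr_orders(void)
{
	return sizeof(demo_free_area) / sizeof(demo_free_area[0]);
}

/*
 * Clamp a requested order. The old exclusive form was
 * "if (order >= MAX) order = MAX - 1"; the inclusive form is below.
 */
static int demo_clamp_order(int order)
{
	if (order > DEMO_MAX_ORDER)
		order = DEMO_MAX_ORDER;
	return order;
}

/*
 * Order of the largest naturally aligned chunk starting at pfn, capped at
 * DEMO_MAX_ORDER. Both sides of the comparison are unsigned long, which is
 * the effect min_t(unsigned long, ...) has in the quoted patch.
 * Treating pfn 0 as maximally aligned is a choice made by this sketch.
 */
static int demo_chunk_order(unsigned long pfn)
{
	unsigned long order = 0;

	if (pfn == 0)
		return DEMO_MAX_ORDER;
	while (!(pfn & 1UL)) {
		pfn >>= 1;
		order++;
	}
	return order < (unsigned long)DEMO_MAX_ORDER ? (int)order : DEMO_MAX_ORDER;
}
```

Calling demo_clamp_order(DEMO_MAX_ORDER) returns DEMO_MAX_ORDER unchanged,
which is the behavioural difference from the old "MAX_ORDER - 1" clamp.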