On Wed, 2012-12-12 at 09:57 +0800, Jianguo Wu wrote:
> On 2012/12/11 21:20, Simon Jeons wrote:
>
> > On Tue, 2012-12-11 at 20:41 +0800, Jianguo Wu wrote:
> >> On 2012/12/11 20:24, Simon Jeons wrote:
> >>
> >>> On Tue, 2012-12-11 at 11:07 +0800, Jianguo Wu wrote:
> >>>> On 2012/12/11 10:33, Tang Chen wrote:
> >>>>
> >>>>> This patch introduces a new array zone_movable_limit[] to store the
> >>>>> ZONE_MOVABLE limit from movablecore_map boot option for all nodes.
> >>>>> The function sanitize_zone_movable_limit() will find out to which
> >>>>> node the ranges in movablecore_map.map[] belong, and calculate the
> >>>>> low boundary of ZONE_MOVABLE for each node.
> >>>>>
> >>>>> Signed-off-by: Tang Chen <tangchen@xxxxxxxxxxxxxx>
> >>>>> Signed-off-by: Jiang Liu <jiang.liu@xxxxxxxxxx>
> >>>>> Reviewed-by: Wen Congyang <wency@xxxxxxxxxxxxxx>
> >>>>> Reviewed-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> >>>>> Tested-by: Lin Feng <linfeng@xxxxxxxxxxxxxx>
> >>>>> ---
> >>>>>  mm/page_alloc.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>  1 files changed, 77 insertions(+), 0 deletions(-)
> >>>>>
> >>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>>> index 1c91d16..4853619 100644
> >>>>> --- a/mm/page_alloc.c
> >>>>> +++ b/mm/page_alloc.c
> >>>>> @@ -206,6 +206,7 @@ static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
> >>>>>  static unsigned long __initdata required_kernelcore;
> >>>>>  static unsigned long __initdata required_movablecore;
> >>>>>  static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
> >>>>> +static unsigned long __meminitdata zone_movable_limit[MAX_NUMNODES];
> >>>>>
> >>>>>  /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
> >>>>>  int movable_zone;
> >>>>> @@ -4340,6 +4341,77 @@ static unsigned long __meminit zone_absent_pages_in_node(int nid,
> >>>>>  	return __absent_pages_in_range(nid, zone_start_pfn, zone_end_pfn);
> >>>>>  }
> >>>>>
> >>>>> +/**
> >>>>> + * sanitize_zone_movable_limit - Sanitize the zone_movable_limit array.
> >>>>> + *
> >>>>> + * zone_movable_limit is initialized as 0. This function will try to get
> >>>>> + * the first ZONE_MOVABLE pfn of each node from movablecore_map, and
> >>>>> + * assign them to zone_movable_limit.
> >>>>> + * zone_movable_limit[nid] == 0 means no limit for the node.
> >>>>> + *
> >>>>> + * Note: Each range is represented as [start_pfn, end_pfn)
> >>>>> + */
> >>>>> +static void __meminit sanitize_zone_movable_limit(void)
> >>>>> +{
> >>>>> +	int map_pos = 0, i, nid;
> >>>>> +	unsigned long start_pfn, end_pfn;
> >>>>> +
> >>>>> +	if (!movablecore_map.nr_map)
> >>>>> +		return;
> >>>>> +
> >>>>> +	/* Iterate all ranges from minimum to maximum */
> >>>>> +	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> >>>>> +		/*
> >>>>> +		 * If we have found lowest pfn of ZONE_MOVABLE of the node
> >>>>> +		 * specified by user, just go on to check next range.
> >>>>> +		 */
> >>>>> +		if (zone_movable_limit[nid])
> >>>>> +			continue;
> >>>>> +
> >>>>> +#ifdef CONFIG_ZONE_DMA
> >>>>> +		/* Skip DMA memory. */
> >>>>> +		if (start_pfn < arch_zone_highest_possible_pfn[ZONE_DMA])
> >>>>> +			start_pfn = arch_zone_highest_possible_pfn[ZONE_DMA];
> >>>>> +#endif
> >>>>> +
> >>>>> +#ifdef CONFIG_ZONE_DMA32
> >>>>> +		/* Skip DMA32 memory. */
> >>>>> +		if (start_pfn < arch_zone_highest_possible_pfn[ZONE_DMA32])
> >>>>> +			start_pfn = arch_zone_highest_possible_pfn[ZONE_DMA32];
> >>>>> +#endif
> >>>>> +
> >>>>> +#ifdef CONFIG_HIGHMEM
> >>>>> +		/* Skip lowmem if ZONE_MOVABLE is highmem. */
> >>>>> +		if (zone_movable_is_highmem() &&
> >>>>
> >>>> Hi Tang,
> >>>>
> >>>> I think zone_movable_is_highmem() does not work correctly here.
> >>>> sanitize_zone_movable_limit
> >>>>     zone_movable_is_highmem      <--using movable_zone here
> >>>> find_zone_movable_pfns_for_nodes
> >>>>     find_usable_zone_for_movable <--movable_zone is specified here
> >>>>
> >>>
> >>> Hi Jianguo and Chen,
> >>>
> >>> - What's the meaning of zone_movable_is_highmem()? Does it mean all zone
> >>> highmem pages are zone movable pages, or ....
> >>
> >> Hi Simon,
> >>
> >> zone_movable_is_highmem() tells whether the pages in ZONE_MOVABLE are taken
> >> from highmem.
> >>
> >>> - dmesg
> >>>
> >>>> [ 0.000000] Zone ranges:
> >>>> [ 0.000000] DMA [mem 0x00010000-0x00ffffff]
> >>>> [ 0.000000] Normal [mem 0x01000000-0x373fdfff]
> >>>> [ 0.000000] HighMem [mem 0x373fe000-0xb6cfffff]
> >>>> [ 0.000000] Movable zone start for each node
> >>>> [ 0.000000] Node 0: 0x97800000
> >>>
> >>> Why is the start of zone movable in the range of zone highmem? Are all
> >>> the pages of zone movable taken from zone highmem? If the answer is yes,
> >>> do zone movable and zone highmem have equal status or not?
> >>
> >> The pages of zone_movable can be taken from zone_highmem or zone_normal:
> >> if we have highmem, then zone_movable will be taken from zone_highmem,
> >> otherwise zone_movable will be taken from zone_normal.
> >>
> >> You can refer to find_usable_zone_for_movable().
> >
> > Hi Jianguo,
> >
> > I have 8G memory, movablecore=5G, but dmesg looks strange. What
> > happened here?
> >
>
> Hi Simon,
>
> I think you are using a 32-bit kernel without CONFIG_X86_PAE enabled, right?
> If so, it can only address memory below 4G.

Thanks for your response. With PAE enabled on an x86 32-bit kernel, 8G memory,
movablecore=6.5G:

[ 0.000000] 8304MB HIGHMEM available.
[ 0.000000] 885MB LOWMEM available.
[ 0.000000] mapped low ram: 0 - 375fe000
[ 0.000000] low ram: 0 - 375fe000
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x00010000-0x00ffffff]
[ 0.000000] Normal [mem 0x01000000-0x375fdfff]
[ 0.000000] HighMem [mem 0x375fe000-0x3e5fffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x00010000-0x0009cfff]
[ 0.000000] node 0: [mem 0x00100000-0x1fffffff]
[ 0.000000] node 0: [mem 0x20200000-0x3fffffff]
[ 0.000000] node 0: [mem 0x40200000-0xb69cbfff]
[ 0.000000] node 0: [mem 0xb6a46000-0xb6a47fff]
[ 0.000000] node 0: [mem 0xb6b1c000-0xb6cfffff]
[ 0.000000] node 0: [mem 0x00000000-0x3e5fffff]
[ 0.000000] On node 0 totalpages: 2051391
[ 0.000000] free_area_init_node: node 0, pgdat c0c26a80, node_mem_map f19de200
[ 0.000000] DMA zone: 32 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 3949 pages, LIFO batch:0
[ 0.000000] Normal zone: 1740 pages used for memmap
[ 0.000000] Normal zone: 220466 pages, LIFO batch:31
[ 0.000000] HighMem zone: 16609 pages used for memmap
[ 0.000000] HighMem zone: 1808595 pages, LIFO batch:31

Why does the Movable zone disappear?
>
> Thanks,
> Jianguo Wu
>
> >> [ 0.000000] Zone ranges:
> >> [ 0.000000] DMA [mem 0x00010000-0x00ffffff]
> >> [ 0.000000] Normal [mem 0x01000000-0x373fdfff]
> >> [ 0.000000] HighMem [mem 0x373fe000-0xb6cfffff]
> >> [ 0.000000] Movable zone start for each node
> >> [ 0.000000] Node 0: 0xb7000000
> >> [ 0.000000] Early memory node ranges
> >> [ 0.000000] node 0: [mem 0x00010000-0x0009cfff]
> >> [ 0.000000] node 0: [mem 0x00100000-0x1fffffff]
> >> [ 0.000000] node 0: [mem 0x20200000-0x3fffffff]
> >> [ 0.000000] node 0: [mem 0x40200000-0xb69cbfff]
> >> [ 0.000000] node 0: [mem 0xb6a46000-0xb6a47fff]
> >> [ 0.000000] node 0: [mem 0xb6b1c000-0xb6cfffff]
> >> [ 0.000000] On node 0 totalpages: 748095
> >> [ 0.000000] DMA zone: 32 pages used for memmap
> >> [ 0.000000] DMA zone: 0 pages reserved
> >> [ 0.000000] DMA zone: 3949 pages, LIFO batch:0
> >> [ 0.000000] Normal zone: 1736 pages used for memmap
> >> [ 0.000000] Normal zone: 219958 pages, LIFO batch:31
> >> [ 0.000000] HighMem zone: 4083 pages used for memmap
> >> [ 0.000000] HighMem zone: 517569 pages, LIFO batch:31
> >> [ 0.000000] Movable zone: 768 pages, LIFO batch:0
> >
> >>
> >> Thanks,
> >> Jianguo Wu
> >>
> >>>
> >>>> I think Jiang Liu's patch works fine for highmem, please refer to:
> >>>> http://marc.info/?l=linux-mm&m=135476085816087&w=2
> >>>>
> >>>> Thanks,
> >>>> Jianguo Wu
> >>>>
> >>>>> +		    start_pfn < arch_zone_lowest_possible_pfn[ZONE_HIGHMEM])
> >>>>> +			start_pfn = arch_zone_lowest_possible_pfn[ZONE_HIGHMEM];
> >>>>> +#endif
> >>>>> +
> >>>>> +		if (start_pfn >= end_pfn)
> >>>>> +			continue;
> >>>>> +
> >>>>> +		while (map_pos < movablecore_map.nr_map) {
> >>>>> +			if (end_pfn <= movablecore_map.map[map_pos].start_pfn)
> >>>>> +				break;
> >>>>> +
> >>>>> +			if (start_pfn >= movablecore_map.map[map_pos].end_pfn) {
> >>>>> +				map_pos++;
> >>>>> +				continue;
> >>>>> +			}
> >>>>> +
> >>>>> +			/*
> >>>>> +			 * The start_pfn of ZONE_MOVABLE is either the minimum
> >>>>> +			 * pfn specified by movablecore_map, or 0, which means
> >>>>> +			 * the node has no ZONE_MOVABLE.
> >>>>> +			 */
> >>>>> +			zone_movable_limit[nid] = max(start_pfn,
> >>>>> +					movablecore_map.map[map_pos].start_pfn);
> >>>>> +
> >>>>> +			break;
> >>>>> +		}
> >>>>> +	}
> >>>>> +}
> >>>>> +
> >>>>>  #else /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
> >>>>>  static inline unsigned long __meminit zone_spanned_pages_in_node(int nid,
> >>>>>  					unsigned long zone_type,
> >>>>> @@ -4358,6 +4430,10 @@ static inline unsigned long __meminit zone_absent_pages_in_node(int nid,
> >>>>>  	return zholes_size[zone_type];
> >>>>>  }
> >>>>>
> >>>>> +static void __meminit sanitize_zone_movable_limit(void)
> >>>>> +{
> >>>>> +}
> >>>>> +
> >>>>>  #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
> >>>>>
> >>>>>  static void __meminit calculate_node_totalpages(struct pglist_data *pgdat,
> >>>>> @@ -4923,6 +4999,7 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
> >>>>>
> >>>>>  	/* Find the PFNs that ZONE_MOVABLE begins at in each node */
> >>>>>  	memset(zone_movable_pfn, 0, sizeof(zone_movable_pfn));
> >>>>> +	sanitize_zone_movable_limit();
> >>>>>  	find_zone_movable_pfns_for_nodes();
> >>>>>
> >>>>>  	/* Print out the zone ranges */
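For readers following the range-matching loop in the quoted patch, here is a
minimal user-space sketch of the same walk. It is not the kernel code: struct
pfn_range, MAX_NODES, max_ul() and compute_movable_limit() are hypothetical
names made up for illustration, and the DMA/DMA32/HIGHMEM clamping from the
patch is left out. It assumes both arrays are sorted by start_pfn and that
every range is half-open, [start_pfn, end_pfn), as the patch comment states.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* hypothetical stand-in for MAX_NUMNODES */

/* Half-open pfn range [start_pfn, end_pfn). nid is ignored for user ranges. */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Illustrative stand-in for sanitize_zone_movable_limit(). For each node,
 * record the lowest pfn at which a memory range of that node overlaps a
 * user-specified range. limit[nid] stays 0 if nothing overlaps.
 */
static void compute_movable_limit(const struct pfn_range *mem, size_t nr_mem,
				  const struct pfn_range *map, size_t nr_map,
				  unsigned long *limit)
{
	size_t map_pos = 0;

	for (size_t i = 0; i < nr_mem; i++) {
		unsigned long start_pfn = mem[i].start_pfn;
		unsigned long end_pfn = mem[i].end_pfn;
		int nid = mem[i].nid;

		assert(nid >= 0 && nid < MAX_NODES);

		/* Lowest pfn for this node was already found. */
		if (limit[nid])
			continue;

		while (map_pos < nr_map) {
			/* User range starts after this memory range. */
			if (end_pfn <= map[map_pos].start_pfn)
				break;

			/* User range ends before this memory range. */
			if (start_pfn >= map[map_pos].end_pfn) {
				map_pos++;
				continue;
			}

			/* The ranges overlap: take the start of the overlap. */
			limit[nid] = max_ul(start_pfn, map[map_pos].start_pfn);
			break;
		}
	}
}
```

With memory ranges [0x000, 0x400) on node 0, [0x400, 0x800) on node 1 and
[0x800, 0xc00) on node 2, and a single user range [0x300, 0x500), the sketch
leaves limit[0] = 0x300, limit[1] = 0x400 and limit[2] = 0 (no limit).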