Re: [Bugme-new] [Bug 36192] New: Kernel panic when boot the 2.6.39+ kernel based off of 2.6.32 kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 08, 2011 at 09:42:19AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 8 Jun 2011 08:40:34 +0900
> <SNIP>

Missing a subject 

> 
> With sparsemem, page_cgroup_init scans pfn from 0 to max_pfn.
> But this may scan a pfn which is not on any node and can access
> memmap which is not initialized.
> 
> This makes page_cgroup_init() for SPARSEMEM node aware and remove
> a code to get nid from page->flags. (Then, we'll use valid NID
> always.)
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> ---
>  mm/page_cgroup.c |   41 +++++++++++++++++++++++++++++++++--------
>  1 file changed, 33 insertions(+), 8 deletions(-)
> 
> Index: linux-3.0-rc1/mm/page_cgroup.c
> ===================================================================
> --- linux-3.0-rc1.orig/mm/page_cgroup.c
> +++ linux-3.0-rc1/mm/page_cgroup.c
> @@ -162,21 +162,25 @@ static void free_page_cgroup(void *addr)
>  }
>  #endif
>  
> -static int __meminit init_section_page_cgroup(unsigned long pfn)
> +static int __meminit init_section_page_cgroup(unsigned long pfn, int nid)
>  {
>  	struct page_cgroup *base, *pc;
>  	struct mem_section *section;
>  	unsigned long table_size;
>  	unsigned long nr;
> -	int nid, index;
> +	int index;
>  
> +	/*
> +	 * Even if passed 'pfn' is not aligned to section, we need to align
> +	 * it to section boundary because of SPARSEMEM pfn calculation.
> +	 */
> +	pfn = ALIGN(pfn, PAGES_PER_SECTION);
>  	nr = pfn_to_section_nr(pfn);

This comment is a bit opaque and from the context of the patch,
it's hard to know why the alignment is necessary. At least move the
alignment to beside where section->page_cgroup is set because it'll
be easier to understand what is going on and why.

>  	section = __nr_to_section(nr);
>  
>  	if (section->page_cgroup)
>  		return 0;
>  
> -	nid = page_to_nid(pfn_to_page(pfn));
>  	table_size = sizeof(struct page_cgroup) * PAGES_PER_SECTION;
>  	base = alloc_page_cgroup(table_size, nid);
>  
> @@ -228,7 +232,7 @@ int __meminit online_page_cgroup(unsigne
>  	for (pfn = start; !fail && pfn < end; pfn += PAGES_PER_SECTION) {
>  		if (!pfn_present(pfn))
>  			continue;
> -		fail = init_section_page_cgroup(pfn);
> +		fail = init_section_page_cgroup(pfn, nid);
>  	}
>  	if (!fail)
>  		return 0;
> @@ -285,14 +289,35 @@ void __init page_cgroup_init(void)
>  {
>  	unsigned long pfn;
>  	int fail = 0;
> +	int node;
>  

Very nit-picky but you sometimes use node and sometimes use nid.
Personally, nid is my preferred choice of name as its meaning is
unambigious.

>  	if (mem_cgroup_disabled())
>  		return;
>  
> -	for (pfn = 0; !fail && pfn < max_pfn; pfn += PAGES_PER_SECTION) {
> -		if (!pfn_present(pfn))
> -			continue;
> -		fail = init_section_page_cgroup(pfn);
> +	for_each_node_state(node, N_HIGH_MEMORY) {
> +		unsigned long start_pfn, end_pfn;
> +
> +		start_pfn = NODE_DATA(node)->node_start_pfn;
> +		end_pfn = start_pfn + NODE_DATA(node)->node_spanned_pages;
> +		/*
> +		 * Because we cannot trust page->flags of page out of node
> +		 * boundary, we skip pfn < start_pfn.
> +		 */
> +		for (pfn = start_pfn;
> +		     !fail && (pfn < end_pfn);
> +		     pfn = ALIGN(pfn + PAGES_PER_SECTION, PAGES_PER_SECTION)) {
> +			if (!pfn_present(pfn))
> +				continue;

Why did you not use pfn_valid()? 

pfn_valid checks a section has SECTION_HAS_MEM_MAP
pfn_present checks a section has SECTION_MARKED_PRESENT

SECTION_MARKED_PRESENT does not necessarily mean mem_map has been
allocated although I admit that this is somewhat unlikely. I'm just
curious if you had a reason for avoiding pfn_valid()?

> +			/*
> +			 * Nodes can be overlapped
> +			 * We know some arch can have nodes layout as
> +			 * -------------pfn-------------->
> +			 * N0 | N1 | N2 | N0 | N1 | N2 |.....
> +			 */
> +			if (pfn_to_nid(pfn) != node)
> +				continue;
> +			fail = init_section_page_cgroup(pfn, node);
> +		}
>  	}
>  	if (fail) {
>  		printk(KERN_CRIT "try 'cgroup_disable=memory' boot option\n");
> 

FWIW, overall I think this is heading in the right direction.

Thanks.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]