Re: A crash on ARM64 in move_freepages_block due to uninitialized pages in reserved memory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Michal,

(CC: +Ard)

On 24/08/18 12:41, Michal Hocko wrote:
> On Thu 23-08-18 15:06:08, James Morse wrote:
> [...]
>> My best-guess is that pfn_valid_within() shouldn't be optimised out if
> ARCH_HAS_HOLES_MEMORYMODEL, even if HOLES_IN_ZONE isn't set.
>>
>> Does something like this solve the problem?:
>> ============================%<============================
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 32699b2dc52a..5e27095a15f4 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -1295,7 +1295,7 @@ void memory_present(int nid, unsigned long start, unsigned
>> long end);
>>   * pfn_valid_within() should be used in this case; we optimise this away
>>   * when we have no holes within a MAX_ORDER_NR_PAGES block.
>>   */
>> -#ifdef CONFIG_HOLES_IN_ZONE
>> +#if defined(CONFIG_HOLES_IN_ZONE) || defined(CONFIG_ARCH_HAS_HOLES_MEMORYMODEL)
>>  #define pfn_valid_within(pfn) pfn_valid(pfn)
>>  #else
>>  #define pfn_valid_within(pfn) (1)
>> ============================%<============================

After plenty of greping, git-archaeology and help from others, I think I've a
clearer picture of what these options do.


Please correct me if I've explained something wrong here:

> This is the first time I hear about CONFIG_ARCH_HAS_HOLES_MEMORYMODEL.

The comment in include/linux/mmzone.h describes this as relevant when parts the
memmap have been free()d. This would happen on systems where memory is smaller
than a sparsemem-section, and the extra struct pages are expensive.
pfn_valid() on these systems returns true for the whole sparsemem-section, so an
extra memmap_valid_within() check is needed.

This is independent of nomap, and isn't relevant on arm64 as our pfn_valid()
always tests the page in memblock due to nomap pages, which can occur anywhere.
(I will propose a patch removing ARCH_HAS_HOLES_MEMORYMODEL for arm64.)


HOLES_IN_ZONE is similar, if some memory is smaller than MAX_ORDER_NR_PAGES,
possibly due to nomap holes.

6d526ee26ccd only enabled it for NUMA systems on arm64, because the NUMA code
was first to fall foul of this, but there is nothing NUMA specific about nomap
holes within a MAX_ORDER_NR_PAGES region.

I'm convinced arm64 should always enable HOLES_IN_ZONE because nomap pages can
occur anywhere. I'll post a fix.


Is it valid to have HOLES_IN_ZONE and !HAVE_ARCH_PFN_VALID?
This would mean pfn_valid_within() is necessary, but pfn_valid() is only looking
at sparse-sections. It looks like ia64 and mips:CAVIUM_OCTEON_SOC are both
configured like this...


> Why it doesn't imply CONFIG_HOLES_IN_ZONE?

I guess the size values for sparsemem-section and MAX_ORDER_NR_PAGES may support
HAS_HOLES_MEMORYMODEL but not HOLES_IN_ZONE. e.g. if only 128Mb of memory
existed in a 256Mb sparsemem-section, but the 4Mb MAX_ORDER_NR_PAGES are always
present if any of their pages are present.


>>> I analyzed the assembler:
>>> PageBuddy in move_freepages returns false
>>> Then we call PageLRU, the macro calls PF_HEAD which is compound_page()
>>> compound_page reads page->compound_head, it is 0xffffffffffffffff, so it
>>> resturns 0xfffffffffffffffe - and accessing this address causes crash
>>
>> Thanks!
>> That wasn't straightforward to work out without the vmlinux.
>>
>> Because you see all-ones, even in KVM, it looks like the struct page is being
>> initialized like that deliberately... I haven't found where this might be happening.
> 
> It should be
> 
> sparse_add_one_section
> #ifdef CONFIG_DEBUG_VM
> 	/*
> 	 * Poison uninitialized struct pages in order to catch invalid flags
> 	 * combinations.
> 	 */
> 	memset(memmap, PAGE_POISON_PATTERN, sizeof(struct page) * PAGES_PER_SECTION);
> #endif

Aha, thanks. (I expected KVMs uninitialized memory to always be zero).


Thanks!

James




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux