Re: [Bug 206401] kernel panic on Hyper-V after 5 minutes due to memory hot-add

On 17.02.20 11:33, Baoquan He wrote:
> On 02/17/20 at 11:24am, David Hildenbrand wrote:
>> On 17.02.20 11:13, Baoquan He wrote:
>>> On 02/17/20 at 10:34am, Oscar Salvador wrote:
>>>> On Mon, Feb 17, 2020 at 02:46:27PM +0900, kkabe@xxxxxxxxxxx wrote:
>>>>> ===========================================
>>>>> struct page * __meminit populate_section_memmap(unsigned long pfn,
>>>>>                 unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
>>>>> {
>>>>>         struct page *page, *ret;
>>>>>         unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
>>>>>
>>>>>         page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
>>>>>         if (page) {
>>>>>                 goto got_map_page;
>>>>>         }
>>>>> pr_info("%s: alloc_pages() returned 0x%p (should be 0), reverting to vmalloc(memmap_size=%lu)\n", __func__, page, memmap_size);
>>>>> BUG_ON(page != 0);
>>>>>
>>>>>         ret = vmalloc(memmap_size);
>>>>> pr_info("%s: vmalloc(%lu) returned 0x%p\n", __func__, memmap_size, ret);
>>>>>         if (ret) {
>>>>>                 goto got_map_ptr;
>>>>>         }
>>>>>
>>>>>         return NULL;
>>>>> got_map_page:
>>>>>         ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
>>>>> pr_info("%s: allocated struct page *page=0x%p\n", __func__, page);
>>>>> got_map_ptr:
>>>>>
>>>>> pr_info("%s: returning struct page * =0x%p\n", __func__, ret);
>>>>>         return ret;
>>>>> }
>>>>
>>>> Could you please replace %p with %px? With the former, pointers are hashed, so it is
>>>> harder to make sense of the values.
>>>>
>>>> David could be right about ZONE_NORMAL vs ZONE_HIGHMEM.
>>>> IIUC, default_kernel_zone_for_pfn and default_zone_for_pfn seem to only deal with
>>>> (ZONE_DMA,ZONE_NORMAL] or ZONE_MOVABLE.
>>>
>>> Ah, I think you both have spotted the problem.
>>>  
>>> On i386, without memory hot add, normal memory only includes the range
>>> below 896M, which is added to the normal zone. The rest is added to the
>>> highmem zone.
>>>  
>>> How does this influence page allocation?
>>>  
>>> Hugely. As we know, on i386, normal memory can be accessed through the
>>> direct mapping (phys_to_virt), namely PAGE_OFFSET + phys, but highmem has
>>> to be accessed with kmap. However, the later hot-added memory is all put
>>> into normal memory, so accesses to it will stomp into the vmalloc area, I
>>> would say.
>>>  
>>> So, i386 doesn't support memory hot add well. Not sure if the change
>>> below can make it work normally.
>>>  
>>> We can just adjust the hot-add code as we have done for boot memory:
>>> when hot adding memory, iterate the zones from highmem if allowed.
>>>  
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 475d0d68a32c..1380392d9ef5 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -716,7 +716,10 @@ static struct zone *default_kernel_zone_for_pfn(int nid, unsigned long start_pfn
>>>  	struct pglist_data *pgdat = NODE_DATA(nid);
>>>  	int zid;
>>>  
>>> -	for (zid = 0; zid <= ZONE_NORMAL; zid++) {
>>> +	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
>>
>> ZONE_DEVICE? :/
> 
> Not sure if ZONE_DEVICE will be supported on 32-bit systems.
> 
> 
>>
>>> +		if (zid == ZONE_MOVABLE)
>>> +			continue;
>>> +
>>>  		struct zone *zone = &pgdat->node_zones[zid];
>>>  
>>>  		if (zone_intersects(zone, start_pfn, nr_pages))
>>>
>>>
>>
>> What if somebody onlines memory from user space explicitly to the normal
>> zone? Can we trigger crashes?
> 
> Seems the current i386 code doesn't support it, unless we change that
> too. Without reserving virtual address space, any memory added later has
> to be highmem.
> 
>>
>> This doesn't look like it ever worked reliably, can we just disable
>> memory hotplug in case we have PAE? (especially, as continued i386
>> support is questionable)
> 
> This is not PAE, this is only HIGHMEM4G.
> 

Ah, okay. Anyhow, highmem combined with hotplug seems to be in a
questionable state. I'd vote for disabling it if possible.

-- 
Thanks,

David / dhildenb




