Re: [PATCH] mm, page_alloc: clear zone_movable_pfn if the node doesn't have ZONE_MOVABLE

On Tue, Dec 18, 2018 at 01:14:51PM +0100, Michal Hocko wrote:
>On Mon 17-12-18 14:18:02, Wei Yang wrote:
>> On Mon, Dec 17, 2018 at 11:25:34AM +0100, Michal Hocko wrote:
>> >On Sun 16-12-18 20:56:24, Wei Yang wrote:
>> >> A non-zero zone_movable_pfn indicates this node has ZONE_MOVABLE, while
>> >> current implementation doesn't comply with this rule when kernel
>> >> parameter "kernelcore=" is used.
>> >> 
>> >> Current implementation doesn't harm the system, since the value in
>> >> zone_movable_pfn is out of the range of the current zone. But the user
>> >> would see this message during bootup, even if that node doesn't have
>> >> ZONE_MOVABLE.
>> >> 
>> >>     Movable zone start for each node
>> >>       Node 0: 0x0000000080000000
>> >
>> >I am sorry but the above description confuses me more than it helps.
>> >Could you start over again and describe the user visible problem, then
>> >follow up with the underlying bug and finally continue with a proposed
>> >fix?
>> 
>> Yep, how about this one:
>> 
>> For example, a machine with 8G RAM, 2 nodes with 4G on each, if we pass
>
>Did you mean 2G on each? Because your nodes do have 2GB each.
>
>> "kernelcore=2G" as kernel parameter, the dmesg looks like:
>> 
>>      Movable zone start for each node
>>        Node 0: 0x0000000080000000
>>        Node 1: 0x0000000100000000
>> 
>> This looks like both Node 0 and 1 have ZONE_MOVABLE, while the following
>> dmesg shows only Node 1 has ZONE_MOVABLE.
>
>Well, the documentation says
>	kernelcore=	[KNL,X86,IA-64,PPC]
>			Format: nn[KMGTPE] | nn% | "mirror"
>			This parameter specifies the amount of memory usable by
>			the kernel for non-movable allocations.  The requested
>			amount is spread evenly throughout all nodes in the
>			system as ZONE_NORMAL.  The remaining memory is used for
>			movable memory in its own zone, ZONE_MOVABLE.  In the
>			event, a node is too small to have both ZONE_NORMAL and
>			ZONE_MOVABLE, kernelcore memory will take priority and
>			other nodes will have a larger ZONE_MOVABLE.

Yes, the current behavior is a little different.

If you look at find_usable_zone_for_movable(), ZONE_MOVABLE is carved out
of the highest usable zone. This means that if a node doesn't have any
memory in that highest zone, all of its memory belongs to kernelcore.

Looks like a design decision?
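
To make the point about find_usable_zone_for_movable() concrete, here is a
minimal user-space sketch, not the kernel code itself. The struct pfn_range
type and the node_can_have_movable() helper are hypothetical names used only
for illustration, and the zone order assumes an x86-64 style configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model; the zone order is an assumption (x86-64 style). */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE, MAX_NR_ZONES };

/* Hypothetical helper type: a half-open PFN range [start_pfn, end_pfn). */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Walk the zones from highest to lowest, skip ZONE_MOVABLE itself, and
 * return the first zone that spans any PFNs. ZONE_MOVABLE can only be
 * carved out of that zone.
 */
static int find_usable_zone_for_movable(const struct pfn_range zone_limits[MAX_NR_ZONES])
{
	int zone_index;

	for (zone_index = MAX_NR_ZONES - 1; zone_index >= 0; zone_index--) {
		if (zone_index == ZONE_MOVABLE)
			continue;
		if (zone_limits[zone_index].end_pfn > zone_limits[zone_index].start_pfn)
			break;
	}

	assert(zone_index != -1);	/* at least one zone must be populated */
	return zone_index;
}

/*
 * Hypothetical check: a node can get ZONE_MOVABLE only if part of its
 * PFN range overlaps the usable zone found above.
 */
static bool node_can_have_movable(struct pfn_range node, struct pfn_range usable_zone)
{
	return node.end_pfn > usable_zone.start_pfn &&
	       node.start_pfn < usable_zone.end_pfn;
}
```

With an assumed layout where ZONE_NORMAL starts at 4G and a node covers only
0-2G, node_can_have_movable() returns false for that node, so all of its
memory stays in kernel zones.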

>
>>      On node 0 totalpages: 524190
>>        DMA zone: 64 pages used for memmap
>>        DMA zone: 21 pages reserved
>>        DMA zone: 3998 pages, LIFO batch:0
>>        DMA32 zone: 8128 pages used for memmap
>>        DMA32 zone: 520192 pages, LIFO batch:63
>>      
>>      On node 1 totalpages: 524255
>>        DMA32 zone: 4096 pages used for memmap
>>        DMA32 zone: 262111 pages, LIFO batch:63
>>        Movable zone: 4096 pages used for memmap
>>        Movable zone: 262144 pages, LIFO batch:63
>
>so assuming you really have 4GB in total and 2GB should be in kernel
>zones then each node should get half of it to kernel zones and the
>remaining 2G evenly distributed to movable zones. So something seems
>broken here.

If we really had this implemented, we would have the following memory
layout.


    +---------+------+---------+--------+------------+
    |DMA      |DMA32 |Movable  |DMA32   |Movable     |
    +---------+------+---------+--------+------------+
    |<        Node 0          >|<      Node 1       >|

This means the zones would no longer be monotonically increasing.

Is this what we expect now? If so, we really have something broken.
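
The monotonicity concern can be sketched as a small user-space check. This is
only an illustration: zones_monotonic() is a hypothetical helper, and the
layouts are written as the sequence of zone types seen when walking physical
memory from low to high addresses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Zone order is an assumption (x86-64 style). */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE };

/*
 * Return true if the zone type never decreases while walking the
 * layout from the lowest address to the highest.
 */
static bool zones_monotonic(const enum zone_type *layout, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (layout[i] < layout[i - 1])
			return false;
	}
	return true;
}
```

For the per-node layout drawn above (DMA, DMA32, Movable, DMA32, Movable) the
check fails, because DMA32 on Node 1 follows Movable on Node 0. For the layout
in the dmesg (DMA, DMA32 on Node 0, then DMA32, Movable on Node 1) it passes.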

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me