Re: [PATCH V2 1/2] mm/page_alloc: Replace set_dma_reserve to set_memory_reserve

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/05/2016 09:24 AM, Srikar Dronamraju wrote:
* Vlastimil Babka <vbabka@xxxxxxx> [2016-08-05 08:45:03]:

@@ -5493,10 +5493,10 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat)
		}

		/* Account for reserved pages */
-		if (j == 0 && freesize > dma_reserve) {
-			freesize -= dma_reserve;
+		if (j == 0 && freesize > nr_memory_reserve) {

Will this really work (together with patch 2) as intended?
This j == 0 means that we are doing this only for the first zone, which is
ZONE_DMA (or ZONE_DMA32) on node 0 on many systems. I.e. I don't think it's
really true that "dma_reserve has nothing to do with DMA or ZONE_DMA".

This zone will have limited amount of memory, so the "freesize >
nr_memory_reserve" will easily be false once you set this to many gigabytes,
so in fact nothing will get subtracted.

On the other hand if the kernel has both CONFIG_ZONE_DMA and
CONFIG_ZONE_DMA32 disabled, then j == 0 will be true for ZONE_NORMAL. This
zone might be present on multiple nodes (unless they are configured as
movable) and then the value intended to be global will be subtracted from
several nodes.

I don't know what's the exact ppc64 situation here, perhaps there are indeed
no DMA/DMA32 zones, and the fadump kernel only uses one node, so it works in
the end, but it doesn't seem much robust to me?


At the page initialization time, powerpc seems to have just one zone
spread across the 16 nodes.

From the dmesg.

[    0.000000] Memory hole size: 0MB
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x00001f5c8fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000001fb4fffffff]
[    0.000000]   node   1: [mem 0x000001fb50000000-0x000003fa8fffffff]
[    0.000000]   node   2: [mem 0x000003fa90000000-0x000005f9cfffffff]
[    0.000000]   node   3: [mem 0x000005f9d0000000-0x000007f8efffffff]
[    0.000000]   node   4: [mem 0x000007f8f0000000-0x000009f81fffffff]
[    0.000000]   node   5: [mem 0x000009f820000000-0x00000bf77fffffff]
[    0.000000]   node   6: [mem 0x00000bf780000000-0x00000df6dfffffff]
[    0.000000]   node   7: [mem 0x00000df6e0000000-0x00000ff63fffffff]
[    0.000000]   node   8: [mem 0x00000ff640000000-0x000011f58fffffff]
[    0.000000]   node   9: [mem 0x000011f590000000-0x000013644fffffff]
[    0.000000]   node  10: [mem 0x0000136450000000-0x00001563afffffff]
[    0.000000]   node  11: [mem 0x00001563b0000000-0x000017630fffffff]
[    0.000000]   node  12: [mem 0x0000176310000000-0x000019625fffffff]
[    0.000000]   node  13: [mem 0x0000196260000000-0x00001b5dcfffffff]
[    0.000000]   node  14: [mem 0x00001b5dd0000000-0x00001d5d2fffffff]
[    0.000000]   node  15: [mem 0x00001d5d30000000-0x00001f5c8fffffff]

Hmm so it will work for ppc64 and its fadump, but I'm not happy that we made the function name sound like it's generic (unlike when the name contained "dma"), while it only works as intended in specific corner cases. The next user might be surprised...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]