On Fri, Jul 13, 2018 at 07:52:40AM +0800, Baoquan He wrote:
>Hi Michal,
>
>On 07/12/18 at 02:32pm, Michal Hocko wrote:
>> On Thu 12-07-18 14:01:15, Chao Fan wrote:
>> > On Thu, Jul 12, 2018 at 01:49:49PM +0800, Dou Liyang wrote:
>> > >Hi Baoquan,
>> > >
>> > >At 07/11/2018 08:40 PM, Baoquan He wrote:
>> > >> Please try this v3 patch:
>> > >>
>> > >> From 9850d3de9c02e570dc7572069a9749a8add4c4c7 Mon Sep 17 00:00:00 2001
>> > >> From: Baoquan He <bhe@xxxxxxxxxx>
>> > >> Date: Wed, 11 Jul 2018 20:31:51 +0800
>> > >> Subject: [PATCH v3] mm, page_alloc: find movable zone after kernel text
>> > >>
>> > >> In find_zone_movable_pfns_for_nodes(), when try to find the starting
>> > >> PFN movable zone begins in each node, kernel text position is not
>> > >> considered. KASLR may put kernel after which movable zone begins.
>> > >>
>> > >> Fix it by finding movable zone after kernel text on that node.
>> > >>
>> > >> Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
>> > >
>> > >
>> > >You fix this in the _zone_init side_. This may make the 'kernelcore=' or
>> > >'movablecore=' failed if the KASLR puts the kernel back the tail of the
>> > >last node, or more.
>> >
>> > I think it may not fail.
>> > There is a 'restart' to do another pass.
>> >
>> > >
>> > >Due to we have fix the mirror memory in KASLR side, and Chao is trying
>> > >to fix the 'movable_node' in KASLR side. Have you had a chance to fix
>> > >this in the KASLR side.
>> > >
>> >
>> > I think it's better to fix here, but not KASLR side.
>> > Cause much more code will be change if doing it in KASLR side.
>> > Since we didn't parse 'kernelcore' in compressed code, and you can see
>> > the distribution of ZONE_MOVABLE need so much code, so we do not need
>> > to do so much job in KASLR side. But here, several lines will be OK.
>>
>> I am not able to find the beginning of the email thread right now. Could
>> you summarize what is the actual problem please?
>
>The bug is found on x86 now.
>
>When added "kernelcore=" or "movablecore=" into kernel command line,
>kernel memory is spread evenly among nodes. However, this is right when
>KASLR is not enabled, then kernel will be at 16M of place in x86 arch.
>If KASLR enabled, it could be put any place from 16M to 64T randomly.
>
>Consider a scenario, we have 10 nodes, and each node has 20G memory, and
>we specify "kernelcore=50%", means each node will take 10G for
>kernelcore, 10G for movable area. But this doesn't take kernel position
>into consideration. E.g if kernel is put at 15G of 2nd node, namely
>node1. Then we think on node1 there's 10G for kernelcore, 10G for
>movable, in fact there's only 5G available for movable, just after
>kernel.
>
>I made a v4 patch which possibly can fix it.
>
>From dbcac3631863aed556dc2c4ff1839772dfd02d18 Mon Sep 17 00:00:00 2001
>From: Baoquan He <bhe@xxxxxxxxxx>
>Date: Fri, 13 Jul 2018 07:49:29 +0800
>Subject: [PATCH v4] mm, page_alloc: find movable zone after kernel text
>
>In find_zone_movable_pfns_for_nodes(), when try to find the starting
>PFN movable zone begins at in each node, kernel text position is not
>considered. KASLR may put kernel after which movable zone begins.
>
>Fix it by finding movable zone after kernel text on that node.
>
>Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>

You can post it as a standalone PATCH, then I will test it next week.
Thanks,
Chao Fan

>---
> mm/page_alloc.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
>diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>index 1521100f1e63..5bc1a47dafda 100644
>--- a/mm/page_alloc.c
>+++ b/mm/page_alloc.c
>@@ -6547,7 +6547,7 @@ static unsigned long __init early_calculate_totalpages(void)
> static void __init find_zone_movable_pfns_for_nodes(void)
> {
> 	int i, nid;
>-	unsigned long usable_startpfn;
>+	unsigned long usable_startpfn, kernel_endpfn, arch_startpfn;
> 	unsigned long kernelcore_node, kernelcore_remaining;
> 	/* save the state before borrow the nodemask */
> 	nodemask_t saved_node_state = node_states[N_MEMORY];
>@@ -6649,8 +6649,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
> 	if (!required_kernelcore || required_kernelcore >= totalpages)
> 		goto out;
>
>+	kernel_endpfn = PFN_UP(__pa_symbol(_end));
> 	/* usable_startpfn is the lowest possible pfn ZONE_MOVABLE can be at */
>-	usable_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
>+	arch_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
>
> restart:
> 	/* Spread kernelcore memory as evenly as possible throughout nodes */
>@@ -6659,6 +6660,16 @@ static void __init find_zone_movable_pfns_for_nodes(void)
> 	kernelcore_node = required_kernelcore / usable_nodes;
> 	for_each_node_state(nid, N_MEMORY) {
> 		unsigned long start_pfn, end_pfn;
>
> 		/*
>+		 * KASLR may put kernel near tail of node memory,
>+		 * start after kernel on that node to find PFN
>+		 * at which zone begins.
>+		 */
>+		if (pfn_to_nid(kernel_endpfn) == nid)
>+			usable_startpfn = max(arch_startpfn, kernel_endpfn);
>+		else
>+			usable_startpfn = arch_startpfn;
>+
>+		/*
> 		 * Recalculate kernelcore_node if the division per node
> 		 * now exceeds what is necessary to satisfy the requested
> 		 * amount of memory for the kernel
>--
>2.13.6