On 3/8/20 11:59 PM, Prabhakar Kushwaha wrote:
. Hi John,
On Sun, Mar 8, 2020 at 12:13 AM John Donnelly
<john.p.donnelly@xxxxxxxxxx> wrote:
On Mar 7, 2020, at 5:06 AM, Chen Zhou <chenzhou10@xxxxxxxxxx> wrote:
On 2020/3/5 18:13, Prabhakar Kushwaha wrote:
On Mon, Dec 23, 2019 at 8:57 PM Chen Zhou <chenzhou10@xxxxxxxxxx> wrote:
Crashkernel=X tries to reserve memory for the crash dump kernel under
4G. If crashkernel=X,low is specified simultaneously, reserve spcified
size low memory for crash kdump kernel devices firstly and then reserve
memory above 4G.
Signed-off-by: Chen Zhou <chenzhou10@xxxxxxxxxx>
---
arch/arm64/kernel/setup.c | 8 +++++++-
arch/arm64/mm/init.c | 31 +++++++++++++++++++++++++++++--
2 files changed, 36 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 56f6645..04d1c87 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -238,7 +238,13 @@ static void __init request_standard_resources(void)
kernel_data.end <= res->end)
request_resource(res, &kernel_data);
#ifdef CONFIG_KEXEC_CORE
- /* Userspace will find "Crash kernel" region in /proc/iomem. */
+ /*
+ * Userspace will find "Crash kernel" region in /proc/iomem.
+ * Note: the low region is renamed as Crash kernel (low).
+ */
+ if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+ crashk_low_res.end <= res->end)
+ request_resource(res, &crashk_low_res);
if (crashk_res.end && crashk_res.start >= res->start &&
crashk_res.end <= res->end)
request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index b65dffd..0d7afd5 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -80,6 +80,7 @@ static void __init reserve_crashkernel(void)
{
unsigned long long crash_base, crash_size;
int ret;
+ phys_addr_t crash_max = arm64_dma32_phys_limit;
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
&crash_size, &crash_base);
@@ -87,12 +88,38 @@ static void __init reserve_crashkernel(void)
if (ret || !crash_size)
return;
+ ret = reserve_crashkernel_low();
+ if (!ret && crashk_low_res.end) {
+ /*
+ * If crashkernel=X,low specified, there may be two regions,
+ * we need to make some changes as follows:
+ *
+ * 1. rename the low region as "Crash kernel (low)"
+ * In order to distinct from the high region and make no effect
+ * to the use of existing kexec-tools, rename the low region as
+ * "Crash kernel (low)".
+ *
+ * 2. change the upper bound for crash memory
+ * Set MEMBLOCK_ALLOC_ACCESSIBLE upper bound for crash memory.
+ *
+ * 3. mark the low region as "nomap"
+ * The low region is intended to be used for crash dump kernel
+ * devices, just mark the low region as "nomap" simply.
+ */
+ const char *rename = "Crash kernel (low)";
+
+ crashk_low_res.name = rename;
+ crash_max = MEMBLOCK_ALLOC_ACCESSIBLE;
+ memblock_mark_nomap(crashk_low_res.start,
+ resource_size(&crashk_low_res));
+ }
+
crash_size = PAGE_ALIGN(crash_size);
if (crash_base == 0) {
/* Current arm64 boot protocol requires 2MB alignment */
- crash_base = memblock_find_in_range(0, arm64_dma32_phys_limit,
- crash_size, SZ_2M);
+ crash_base = memblock_find_in_range(0, crash_max, crash_size,
+ SZ_2M);
if (crash_base == 0) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
crash_size);
--
I tested this patch series on ARM64-ThunderX2 with no issue with
bootargs crashkenel=X@Y crashkernel=250M,low
$ dmesg | grep crash
[ 0.000000] crashkernel reserved: 0x0000000b81200000 -
0x0000000c81200000 (4096 MB)
[ 0.000000] Kernel command line:
BOOT_IMAGE=/boot/vmlinuz-5.6.0-rc4+
root=UUID=866b8df3-14f4-4e11-95a1-74a90ee9b694 ro
crashkernel=4G@0xb81200000 crashkernel=250M,low nowatchdog earlycon
[ 29.310209] crashkernel=250M,low
$ kexec -p -i /boot/vmlinuz-`uname -r`
--initrd=/boot/initrd.img-`uname -r` --reuse-cmdline
$ echo 1 > /proc/sys/kernel/sysrq ; echo c > /proc/sysrq-trigger
But when i tried with crashkernel=4G crashkernel=250M,low as bootargs.
Kernel is not able to allocate memory.
[ 0.000000] cannot allocate crashkernel (size:0x100000000)
[ 0.000000] Kernel command line:
BOOT_IMAGE=/boot/vmlinuz-5.6.0-rc4+
root=UUID=866b8df3-14f4-4e11-95a1-74a90ee9b694 ro crashkernel=4G
crashkernel=250M,low nowatchdog
[ 29.332081] crashkernel=250M,low
does crashkernel=X@Y mandatory to get allocated beyond 4G?
am I missing something?
crashkernel=4G
You need to look at the memory map on node 0 from dmesg ( or /proc/iomem ) to determine if there is any memory in that range - 0x100000000 == 1st byte above 4G .
i believe i have enough free memory. Please find log below
$ dmesg | grep "node 0"
[ 0.000000] Initmem setup node 0 [mem 0x00000000802f0000-0x0000009ffcffffff]
[ 0.000000] On node 0 totalpages: 33537296
[ 12.335714] pci_bus 0000:00: on NUMA node 0
$
I am passing 4G@0xb81200000 in working scenario, here 0xb81200000 is
well within node 0 range.
Logs of iomem is below:
$ cat /proc/iomem
00000000-00000000 : PCI ECAM
00000000-00000000 : PCI ECAM
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:0f
00000000-00000000 : PCI Bus 0000:10
00000000-00000000 : 0000:10:00.0
00000000-00000000 : 0000:10:00.0
00000000-00000000 : PCI Bus 0000:01
00000000-00000000 : 0000:01:00.0
00000000-00000000 : 0000:01:00.1
00000000-00000000 : PCI Bus 0000:05
00000000-00000000 : 0000:05:00.0
00000000-00000000 : 0000:05:00.1
00000000-00000000 : PCI Bus 0000:09
00000000-00000000 : 0000:09:00.0
00000000-00000000 : 0000:09:00.1
00000000-00000000 : 0000:00:10.0
00000000-00000000 : ahci
00000000-00000000 : 0000:00:10.1
00000000-00000000 : ahci
00000000-00000000 : PCI Bus 0000:80
00000000-00000000 : PCI Bus 0000:83
00000000-00000000 : 0000:83:00.0
00000000-00000000 : 0000:83:00.0
00000000-00000000 : nvme
00000000-00000000 : PCI Bus 0000:89
00000000-00000000 : 0000:89:00.0
00000000-00000000 : e1000e
00000000-00000000 : 0000:89:00.0
00000000-00000000 : 0000:89:00.0
00000000-00000000 : e1000e
00000000-00000000 : 0000:89:00.0
00000000-00000000 : e1000e
00000000-00000000 : PCI Bus 0000:8d
00000000-00000000 : 0000:8d:00.0
00000000-00000000 : 0000:8d:00.0
00000000-00000000 : mpt3sas
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : Kernel code
00000000-00000000 : reserved
00000000-00000000 : Kernel data
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : CAV901C:00
00000000-00000000 : CAV901D:00
00000000-00000000 : CAV901C:00
00000000-00000000 : CAV901E:00
00000000-00000000 : CAV901C:00
00000000-00000000 : CAV901F:00
00000000-00000000 : CAV901C:00
00000000-00000000 : CAV9006:00
00000000-00000000 : CAV9006:00
00000000-00000000 : ARMH0011:00
00000000-00000000 : ARMH0011:00
00000000-00000000 : arm-smmu-v3.0.auto
00000000-00000000 : arm-smmu-v3.0.auto
00000000-00000000 : arm-smmu-v3.1.auto
00000000-00000000 : arm-smmu-v3.1.auto
00000000-00000000 : arm-smmu-v3.2.auto
00000000-00000000 : arm-smmu-v3.2.auto
00000000-00000000 : CAV901C:01
00000000-00000000 : CAV901D:01
00000000-00000000 : CAV901C:01
00000000-00000000 : CAV901E:01
00000000-00000000 : CAV901C:01
00000000-00000000 : CAV901F:01
00000000-00000000 : CAV901C:01
00000000-00000000 : CAV9007:06
00000000-00000000 : CAV9007:06
00000000-00000000 : arm-smmu-v3.3.auto
00000000-00000000 : arm-smmu-v3.3.auto
00000000-00000000 : arm-smmu-v3.4.auto
00000000-00000000 : arm-smmu-v3.4.auto
00000000-00000000 : arm-smmu-v3.5.auto
00000000-00000000 : arm-smmu-v3.5.auto
00000000-00000000 : System RAM
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : System RAM
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : reserved
00000000-00000000 : PCI Bus 0000:00
00000000-00000000 : PCI Bus 0000:01
00000000-00000000 : 0000:01:00.0
00000000-00000000 : 0000:01:00.1
00000000-00000000 : 0000:01:00.0
00000000-00000000 : 0000:01:00.1
00000000-00000000 : 0000:01:00.0
00000000-00000000 : 0000:01:00.1
00000000-00000000 : PCI Bus 0000:05
00000000-00000000 : 0000:05:00.0
00000000-00000000 : bnx2x
00000000-00000000 : 0000:05:00.1
00000000-00000000 : bnx2x
00000000-00000000 : 0000:05:00.0
00000000-00000000 : bnx2x
00000000-00000000 : 0000:05:00.0
00000000-00000000 : bnx2x
00000000-00000000 : 0000:05:00.1
00000000-00000000 : bnx2x
00000000-00000000 : 0000:05:00.1
00000000-00000000 : bnx2x
00000000-00000000 : PCI Bus 0000:09
00000000-00000000 : 0000:09:00.0
00000000-00000000 : i40e
00000000-00000000 : 0000:09:00.1
00000000-00000000 : i40e
00000000-00000000 : 0000:09:00.0
00000000-00000000 : 0000:09:00.1
00000000-00000000 : 0000:09:00.0
00000000-00000000 : i40e
00000000-00000000 : 0000:09:00.1
00000000-00000000 : i40e
00000000-00000000 : 0000:09:00.0
00000000-00000000 : 0000:09:00.1
00000000-00000000 : 0000:00:0f.0
00000000-00000000 : xhci-hcd
00000000-00000000 : 0000:00:0f.0
00000000-00000000 : 0000:00:0f.1
00000000-00000000 : xhci-hcd
00000000-00000000 : 0000:00:0f.1
00000000-00000000 : 0000:00:10.0
00000000-00000000 : ahci
00000000-00000000 : 0000:00:10.1
00000000-00000000 : ahci
00000000-00000000 : PCI Bus 0000:80
These appear all zero to me !
here is 1 flavor I have :
[root@ca-dev-arm19 ~]# cat /proc/iomem
12600000-12600fff : ARMH0011:00
12600000-12600fff : ARMH0011:00
12610000-12610fff : ARMH0011:01
12610000-12610fff : ARMH0011:01
126b0000-126b0fff : APMC0D0F:00
126b0000-126b0fff : APMC0D0F:00
126f0000-126f0fff : APMC0D81:00
126f0000-126f0fff : APMC0D81:00
12730000-12730fff : arch_mem_timer
127c0000-127c0fff : sbsa-gwdt.0
127c0000-127c0fff : sbsa-gwdt.0
127d0000-127d0fff : sbsa-gwdt.0
127d0000-127d0fff : sbsa-gwdt.0
13800000-138fffff : 808622B7:00
13800000-138fffff : 808622B7:00
13900000-139fffff : 808622B7:01
13900000-139fffff : 808622B7:01
14000000-140fffff : arm-smmu.0.auto
14000000-140fffff : arm-smmu.0.auto
15000000-150fffff : arm-smmu.1.auto
15000000-150fffff : arm-smmu.1.auto
1c000000-1c000fff : APMC0D33:00
1c000000-1c000fff : APMC0D33:00
1c100000-1c100fff : APMC0D33:01
1c100000-1c100fff : APMC0D33:01
78810000-78810fff : APMC0D83:00
78810000-78810fff : APMC0D83:00
7e200000-7e200fff : APMC0D83:00
7e200000-7e200fff : APMC0D83:00
7e810000-7e810fff : APMC0D84:00
7e810000-7e810fff :
7e830000-7e830fff : APMC0D84:01
7e830000-7e830fff :
7e850000-7e850fff : APMC0D84:02
7e850000-7e850fff :
7e870000-7e870fff : APMC0D84:03
7e870000-7e870fff :
7e890000-7e890fff : APMC0D84:04
7e890000-7e890fff :
7e8b0000-7e8b0fff : APMC0D84:05
7e8b0000-7e8b0fff :
7e8d0000-7e8d0fff : APMC0D84:06
7e8d0000-7e8d0fff :
7e8f0000-7e8f0fff : APMC0D84:07
7e8f0000-7e8f0fff :
7e910000-7e910fff : APMC0D87:00
7e910000-7e910fff :
7e930000-7e930fff : APMC0D87:01
7e930000-7e930fff :
7ea50000-7ea50fff : APMC0D88:00
7ea50000-7ea50fff :
7ead0000-7ead0fff : APMC0D88:01
7ead0000-7ead0fff :
7eb50000-7eb50fff : APMC0D88:02
7eb50000-7eb50fff :
7ebd0000-7ebd0fff : APMC0D88:03
7ebd0000-7ebd0fff :
7ec50000-7ec50fff : APMC0D88:04
7ec50000-7ec50fff :
7ecd0000-7ecd0fff : APMC0D88:05
7ecd0000-7ecd0fff :
7ed50000-7ed50fff : APMC0D88:06
7ed50000-7ed50fff :
7edd0000-7edd0fff : APMC0D88:07
7edd0000-7edd0fff :
90000000-91ffffff : System RAM
92000000-928bffff : reserved
928c0000-fff7ffff : System RAM
92a80000-93b6ffff : Kernel code
942c0000-94f0ffff : Kernel data
eee00000-ffdfffff : Crash kernel
fff80000-ffffffff : reserved
400000000-40fffffff : PCI ECAM
430000000-4efffffff : PCI Bus 0007:00
430000000-4317fffff : PCI Bus 0007:01
430000000-4317fffff : PCI Bus 0007:02
430000000-430ffffff : 0007:02:00.0
431000000-43103ffff : 0007:02:00.0
431040000-43105ffff : 0007:02:00.0
500000000-5ffffffff : PCI Bus 0007:00
600000000-60fffffff : PCI ECAM
630000000-6efffffff : PCI Bus 0006:00
630000000-6302fffff : PCI Bus 0006:01
630000000-6300fffff : 0006:01:00.0
630000000-6300fffff : igb
630100000-6301fffff : 0006:01:00.0
630200000-630203fff : 0006:01:00.0
630200000-630203fff : igb
700000000-7ffffffff : PCI Bus 0006:00
880000000-fffffffff : System RAM
1000000000-100fffffff : PCI ECAM
1030000000-10efffffff : PCI Bus 0002:00
1100000000-57ffffffff : PCI Bus 0002:00
5800000000-580fffffff : PCI ECAM
5830000000-58efffffff : PCI Bus 0003:00
5900000000-5fffffffff : PCI Bus 0003:00
6000000000-600fffffff : PCI ECAM
6030000000-60efffffff : PCI Bus 0004:00
6100000000-6fffffffff : PCI Bus 0004:00
7000000000-700fffffff : PCI ECAM
7030000000-70efffffff : PCI Bus 0005:00
7100000000-77ffffffff : PCI Bus 0005:00
7800000000-780fffffff : PCI ECAM
7830000000-78efffffff : PCI Bus 0001:00
7900000000-7fffffffff : PCI Bus 0001:00
8800000000-bff12dffff : System RAM
bff12e0000-bff13dffff : reserved
bff13e0000-bff17fffff : System RAM
bff1800000-bff180ffff : reserved
bff1810000-bff23cffff : System RAM
bff23d0000-bff23dffff : reserved
bff23e0000-bff68fffff : System RAM
bff6900000-bff690ffff : reserved
bff6910000-bff801ffff : System RAM
bff8020000-bff849ffff : reserved
bff84a0000-bff856ffff : System RAM
bff8570000-bff85bffff : reserved
bff85c0000-bff8b4ffff : System RAM
bff8b50000-bff8b6ffff : reserved
bff8b70000-bff8b8ffff : System RAM
bff8b90000-bff8baffff : reserved
bff8bb0000-bff8bcffff : System RAM
bff8bd0000-bff8bdffff : reserved
bff8be0000-bffad8ffff : System RAM
bffad90000-bffe19ffff : reserved
bffe1a0000-bfffc9ffff : System RAM
bfffca0000-bfffccffff : reserved
bfffcd0000-bfffd2ffff : System RAM
bfffd30000-bfffd8ffff : reserved
bfffd90000-bfffffffff : System RAM
10000000000-1000fffffff : PCI ECAM
10030000000-100efffffff : PCI Bus 0000:00
10030000000-100301fffff : PCI Bus 0000:01
10030000000-100300fffff : 0000:01:00.0
10030100000-1003013ffff : 0000:01:00.0
10030140000-1003014ffff : 0000:01:00.0
10030140000-1003014ffff : megasas: LSI
Here is my memory map from dmesg from 1 type of machine :
[ 0.000000] NUMA: NODE_DATA [mem 0xbfffffe180-0xbfffffffff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000090000000-0x00000000ffffffff]
[ 0.000000] Normal [mem 0x0000000100000000-0x000000bfffffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000090000000-0x0000000091ffffff]
[ 0.000000] node 0: [mem 0x0000000092000000-0x00000000928bffff]
[ 0.000000] node 0: [mem 0x00000000928c0000-0x00000000fff7ffff]
[ 0.000000] node 0: [mem 0x00000000fff80000-0x00000000ffffffff]
[ 0.000000] node 0: [mem 0x0000000880000000-0x0000000ffffffffff]
The maps vary on different flavors of server class equipment. It is
vendor specific.
--pk
_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
https://urldefense.com/v3/__http://lists.infradead.org/mailman/listinfo/kexec__;!!GqivPVa7Brio!PWa-7CQ5Hx7dC_Aih8VYL9Fi6RZFuoTN9wYtbBCiUoStUuwwhNeaaXaGe5BfV3FbqPg4$
--
Thank You,
John
_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec