Re: [PATCH v7 0/4] support reserving crashkernel above 4G on arm64 kdump


 



On 2/12/20 8:10 PM, Chen Zhou wrote:
Hi John,

On 2020/2/12 21:20, John Donnelly wrote:
On 12/23/19 9:23 AM, Chen Zhou wrote:
This patch series enables reserving crashkernel above 4G on arm64.

There are the following issues in arm64 kdump:
1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
when there is not enough low memory.
2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G.
In this case, if swiotlb or DMA buffers are required, the crash dump kernel
will fail to boot because there is no low memory available for allocation.

To solve these issues, introduce crashkernel=X,low to reserve a specified
amount of low memory.
crashkernel=X tries to reserve memory for the crash dump kernel under
4G. If crashkernel=Y,low is specified simultaneously, reserve the specified
amount of low memory for crash dump kernel devices first and then reserve
memory above 4G.
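
The behaviour described above can be modelled with a small sketch. This is not the kernel's parser; it is a toy model of the two cases in the cover letter, and the function names (parse_size, plan_crashkernel), the placement labels, and the example command line are all hypothetical. Other forms such as crashkernel=Y@X are deliberately ignored here.

```python
import re

# Size suffixes accepted by this sketch (binary units).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '512M' or '2G' to bytes."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if m is None:
        raise ValueError(f"bad size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


def plan_crashkernel(cmdline):
    """Return a list of (placement, size_in_bytes) for a command line.

    Models only the two cases from the cover letter:
      crashkernel=X                    -> one region below 4G
      crashkernel=X crashkernel=Y,low  -> Y below 4G first, then X above 4G
    """
    main = low = None
    for word in cmdline.split():
        if not word.startswith("crashkernel="):
            continue
        value = word[len("crashkernel="):]
        if value.endswith(",low"):
            low = parse_size(value[:-len(",low")])
        elif "@" not in value and "," not in value:
            main = parse_size(value)
    if main is None:
        return []
    if low is None:
        return [("below 4G", main)]
    return [("below 4G (low)", low), ("above 4G", main)]


# Hypothetical command line, for illustration only.
for placement, size in plan_crashkernel("crashkernel=2G crashkernel=250M,low"):
    print(f"{placement}: {size >> 20} MiB")
```

With both options present, the low region is listed first because the description above reserves it before the high region.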



Hi Chen,


I've applied your V7 patches to a 5.4.17 test kernel and I am still seeing
failures when I use a kdump kernel.


On the kernel boot I see:

  Reserving 250MB of low memory at 3618MB for crashkernel (System low RAM: 2029MB)
  crashkernel reserved: 0x00000008c0000000 - 0x0000000940000000 (2048 MB)

# cat /proc/iomem  | grep -i cras
   e2200000-f1bfffff : Crash kernel (low)
   8c0000000-93fffffff : Crash kernel
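
As a cross-check of the two outputs above, the /proc/iomem ranges can be converted to start offsets and sizes and compared with the boot messages. The following is only a sketch: the helper name crash_regions is made up for illustration, and the sample text is the two lines quoted above rather than a live read of /proc/iomem.

```python
import re

MIB = 1 << 20
FOUR_GIB = 1 << 32


def crash_regions(iomem_text):
    """Yield (name, start, end) for each 'Crash kernel' line in /proc/iomem text."""
    pattern = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (Crash kernel.*)$")
    for line in iomem_text.splitlines():
        m = pattern.match(line)
        if m:
            yield m.group(3).strip(), int(m.group(1), 16), int(m.group(2), 16)


# The two lines from the grep output above.
sample = """\
   e2200000-f1bfffff : Crash kernel (low)
   8c0000000-93fffffff : Crash kernel
"""

for name, start, end in crash_regions(sample):
    size = end - start + 1  # the end address is inclusive
    where = "below" if end < FOUR_GIB else "above"
    print(f"{name}: starts at {start // MIB} MiB, size {size // MIB} MiB, {where} 4G")
```

For these two lines the low region works out to 250 MiB starting at 3618 MiB, and the high region to 2048 MiB, which agrees with the sizes reported in the boot messages.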


When the kdump kernel is started I see what appears to be DMA initialized:

NUMA: NODE_DATA(1) on node 0
Zone ranges:
DMA32    [mem 0x00000000802f0000-0x00000000ffffffff]
Normal   [mem 0x0000000100000000-0x000000093fffffff]

But the SAS driver still fails:


[   12.092769] CPU: 0 PID: 149 Comm: kworker/0:13 Not tainted 5.4.17-4-uek6m_ol8-jpdonnel+ #2
[   12.101019] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL028 09/09/2019
[   12.109019] Workqueue: events work_for_cpu_fn
[   12.113363] Call trace:
[   12.115803]  dump_backtrace+0x0/0x19c
[   12.119453]  show_stack+0x24/0x2c
[   12.122769]  dump_stack+0xcc/0xf8
[   12.126078]  warn_alloc+0x108/0x11c
[   12.129554]  __alloc_pages_slowpath+0x8fc/0xa10
[   12.134071]  __alloc_pages_nodemask+0x2ec/0x334
[   12.138597]  __dma_direct_alloc_pages+0x19c/0x238
[   12.143288]  dma_direct_alloc_pages+0x48/0xf8
[   12.147632]  dma_direct_alloc+0x4c/0x6c
[   12.151455]  dma_alloc_attrs+0x88/0xf4
[   12.155196]  dma_pool_alloc+0x11c/0x2ec
[   12.159053]  _base_allocate_memory_pools+0x2ec/0x1078 [mpt3sas]
[   12.164978]  mpt3sas_base_attach+0x444/0x748 [mpt3sas]
[   12.170121]  _scsih_probe+0x554/0x848 [mpt3sas]
[   12.174648]  local_pci_probe+0x4c/0x98

And the kdump fails to find local storage:


  mpt3sas_cm0: reply_post_free pool: dma_pool_alloc failed
  mpt3sas_cm0: failure at ../drivers/scsi/mpt3sas/mpt3sas_scsih.c:10626/_scsih_probe()!




When crashkernel is reserved above 4G in memory, that is, when crashkernel=X,low
is specified simultaneously, the kernel should reserve the specified amount of
low memory for crash dump kernel devices. So there may be two crash kernel
regions: one below 4G and the other above 4G.

Can we consider a different name for "low", like "dma"? That is what it is intended for.

I.e., so it shows up as:

# cat /proc/iomem  | grep -i cras
   e2200000-f1bfffff : Crash kernel (dma)
   8c0000000-93fffffff : Crash kernel



To distinguish it from the high region and avoid affecting the use of
kexec-tools, rename the low region as "Crash kernel (low)", and add the DT property
"linux,low-memory-range" to the crash dump kernel's dtb to pass the low region.

Besides, we need to modify kexec-tools:
arm64: kdump: add another DT property to crash dump kernel's dtb(see [1])


Can you explain what needs to be done to kexec-tools in more detail?

I'd like to understand why the Arm kdump boot images are so large (1024M+) compared to x86, which can take a vmcore using a 512M kdump image.

As I said above, we also need to modify kexec-tools; the patch is
"arm64: kdump: add another DT property to crash dump kernel's dtb" (see [1]).


First, the usable memory of the crash dump kernel is passed via a DT property, which is done by kexec-tools.

Currently, there is only one crash kernel region on arm64, which is passed by the DT property "linux,usable-memory-range".
We need to add another region, "crash kernel low", used for crash dump kernel devices, so kexec-tools needs to add
another DT property, "linux,low-memory-range", and then load the crash kernel into the high region.
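
To make the two properties concrete, here is a rough sketch of how such ranges could be encoded as device-tree property values. It assumes each property holds a single <base size> pair with #address-cells = 2 and #size-cells = 2 (two big-endian 64-bit values), and that "linux,low-memory-range" uses the same layout as "linux,usable-memory-range"; the actual layout is defined by the kexec-tools patch in [1], not by this sketch. The helper names are hypothetical, and the addresses are taken from the /proc/iomem output quoted earlier in this thread.

```python
import struct


def encode_range(base, size):
    """Pack one <base size> pair as two big-endian 64-bit values.

    Assumes #address-cells = 2 and #size-cells = 2.
    """
    return struct.pack(">QQ", base, size)


def decode_range(blob):
    """Inverse of encode_range: return (base, size)."""
    return struct.unpack(">QQ", blob)


# Regions from the /proc/iomem output earlier in the thread.
properties = {
    # High region: 0x8c0000000 - 0x93fffffff (2048 MiB)
    "linux,usable-memory-range": encode_range(0x8c0000000, 0x80000000),
    # Low region: 0xe2200000 - 0xf1bfffff (250 MiB)
    "linux,low-memory-range": encode_range(0xe2200000, 0xfa00000),
}

for name, blob in properties.items():
    base, size = decode_range(blob)
    print(f"{name} = <{base:#x} {size:#x}>  raw={blob.hex()}")
```

The crash dump kernel would then read both properties at boot to learn which high region it may run in and which low region is available for device allocations.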

More details can be retrieved from https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2019-August/023569.html__;!!GqivPVa7Brio!I8OZuGITkxJtV3rT-AhRHey6R1A8sKmoPof2Ss5p9RdVPCQldpTfg-a_3SXv6sSNdSJQ$ .

Thanks,
Chen Zhou

Hi Chen,


I'll look at those changes.

Perhaps for Arm64 we could hard-code the DMA range setup for POC bring-up if the crash kdump kernel is utilizing memory above 4G. I don't see how an administrator is going to determine the value and size, do you?
What value did you use for "crashkernel=xx,low"?




I did an experiment today. If you configure an Arm server-class machine using "mem=896M maxcpus=1" on the cmdline, which confines the address space to below 1G, the system won't boot. It suffers the same OOM issues that a kdump kernel does as it boots to single-user mode.








======= <clipped>=======


.



_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
https://urldefense.com/v3/__http://lists.infradead.org/mailman/listinfo/kexec__;!!GqivPVa7Brio!I8OZuGITkxJtV3rT-AhRHey6R1A8sKmoPof2Ss5p9RdVPCQldpTfg-a_3SXv6tPhWi2g$



--
Thank You,
John



