Hi Chen, On Wed, Jul 29, 2020 at 11:52:39AM +0800, chenzhou wrote: > On 2020/7/28 1:30, Catalin Marinas wrote: > > Anyway, there are two series solving slightly different issues with > > kdump reservations: > > > > 1. This series which relaxes the crashkernel= allocation to go anywhere > > in the accessible space while having a dedicated crashkernel=X,low > > option for ZONE_DMA. > > > > 2. Bhupesh's series [1] forcing crashkernel=X allocations only from > > ZONE_DMA. > > > > For RPi4 support, we limited ZONE_DMA allocations to the 1st GB. > > Existing crashkernel= uses may no longer work, depending on where the > > allocation falls. Option (2) above is a quick fix assuming that the > > crashkernel reservation is small enough. What's a typical crashkernel > > option here? That series is probably more prone to reservation failures. > > > > Option (1), i.e. this series, doesn't solve the problem raised by > > Bhupesh unless one uses the crashkernel=X,low argument. It can actually > > make it worse even for ZONE_DMA32 since the allocation can go above 4G > > (assuming that we change the ZONE_DMA configuration to only limit it to > > 1GB on RPi4). > > > > I'm more inclined to keep the crashkernel= behaviour to ZONE_DMA > > allocations. If this is too small for typical kdump, we can look into > > expanding ZONE_DMA to 4G on non-RPi4 hardware (we had patches on the > > list). In addition, if Chen thinks allocations above 4G are still needed > > or if RPi4 needs a sufficiently large crashkernel=, I'd rather have a > > ",high" option to explicitly require such access. > > Thanks for your reply and exhaustive explanation. > > In our ARM servers, we need to to reserve a large chunk for kdump(512M > or 1G), there is no enough low memory. So we proposed this patch > series "support reserving crashkernel above 4G on arm64 kdump" In > April 2019. Trying to go through the discussions last year, hopefully things get clearer. So prior to the ZONE_DMA change, you still couldn't reserve 1G in the first 4GB? It shouldn't be sparsely populated during early boot. > I introduce parameters "crashkernel=X,[high,low]" as x86_64 does in earlier versions. > Suggested by James, to simplify, we call reserve_crashkernel_low() at the beginning of > reserve_crashkernel() and then relax the arm64_dma32_phys_limit if reserve_crashkernel_low() > allocated something. > That is, just the parameter "crashkernel=X,low" is ok and i deleted "crashkernel=X,high". The problem I see is that with your patches we diverge from x86 behaviour (and the arm64 behaviour prior to the ZONE_DMA reduction) as we now require that crashkernel=X,low is always passed if you want something in ZONE_DMA (and you do want, otherwise the crashdump kernel fails to boot). My main requirement is that crashkernel=X, without any suffix, still works which I don't think is guaranteed with your patches (well, ignoring RPi4 ZONE_DMA). Bhupesh's series is a quick fix but doesn't solve your large allocation requirements (that may have worked prior to the ZONE_DMA change). > After the ZONE_DMA introduced in December 2019, the issue occurred as > you said above. In fact, we didn't have RPi4 machine. You don't even need to have a RPi4 machine, ZONE_DMA has been set to 1GB unconditionally. And while we could move it back to 4GB on non-RPi4 hardware, I'd rather have a solution that fixes kdump for RPi4 as well. > Originally, i suggested to fix this based on this patch series and > used the dedicated option. > > According to your clarify, for typical kdump, there are other > solutions. In this case, "keep the crashkernel= behaviour to ZONE_DMA > allocations" looks much better. > > How about like this: > 1. For ZONE_DMA issue, use Bhupesh's solution, keep the crashkernel= > behaviour to ZONE_DMA allocations. > 2. For this patch series, make the reserve_crashkernel_low() to > ZONE_DMA allocations. So you mean rebasing your series on top of Bhupesh's? I guess you can combine the two, I really don't care which way as long as we fix both issues and agree on the crashkernel= semantics. I think with some tweaks we can go with your series alone. IIUC from the x86 code (especially the part you #ifdef'ed out for arm64), if ",low" is not passed (so just standard crashkernel=X), it still allocates sufficient low memory for the swiotlb in ZONE_DMA. The rest can go in a high region. Why can't we do something similar on arm64? Of course, you can keep the ",low" argument for explicit allocation but I don't want to mandate it. So with an implicit ZONE_DMA allocation similar to the x86 one, we probably don't need Bhupesh's series at all. In addition, we can limit crashkernel= to the first 4G with a fall-back to high like x86 (not sure if memblock_find_in_range() is guaranteed to search in ascending order). I don't think we need an explicit ",high" annotation. So with the above, just a crashkernel=1G gives you at least 256MB in ZONE_DMA followed by the rest anywhere, with a preference for ZONE_DMA32. This way we can also keep the reserve_crashkernel_low() mostly intact from x86 (less #ifdef's). Do I miss anything? -- Catalin