Re: [PATCH][v2] arm64: Allocate elfcorehdr & crashkernel mem before any reservation

James Morse <james.morse@xxxxxxx> · Fri, 19 Jan 2018 12:16:31 +0000

Hello,

On 16/01/18 07:07, takahiro.akashi@xxxxxxxxxx wrote:
> On Mon, Jan 15, 2018 at 10:14:05AM +0530, Bhupesh SHARMA wrote:
>> On Sat, Jan 13, 2018 at 8:37 AM, Poonam Aggrwal <poonam.aggrwal@xxxxxxx> wrote:
>>>> On 08/01/18 04:31, Poonam Aggrwal wrote:
>>>>> Yeah, this is a good point. So ideally the address of the crash kernel
>>>>> should be diligently provided by the user based on the system.

>>>> Even better: the region to store the crash kernel in should be chosen by the
>>>> kernel.
>>>> When using kdump I boot with 'crashkernel=1G', the kernel chooses where to
>>>> place the reserved region.
>>>> Even if I specified a reasonable physical address, the
>>>> efistub may relocate the kernel over the top during boot as part of its KASLR
>>>> work.

>>> Agree
>>>>
>>>> (Why does anyone ever need to specify an offset here?)
>>> offset is normally an optional argument. Request Takahiro to provide his inputs.  Does this imply any updates in kexec design/implementation/documentation?
>>
>> offset is an optional argument. For relocatable kernels (and kernels
>> which support KASLR), specifying offset is normally not needed.
>>
>> Please refer to the 'Extended crashkernel syntax' documentation
>> (<https://github.com/torvalds/linux/blob/master/Documentation/kdump/kdump.txt#L259>):
>>
>> Extended crashkernel syntax
>> ===========================
>>
>> While the "crashkernel=size[@offset]" syntax is sufficient for most
>> configurations, sometimes it's handy to have the reserved memory dependent
>> on the value of System RAM -- that's mostly for distributors that pre-setup
>> the kernel command line to avoid a unbootable system after some memory has
>> been removed from the machine.

This is to let the distribution provide a single value that works on machines
with very little RAM, and machines that need gigabytes in order to boot.
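
As a rough illustration of how one such string covers both small and large
machines, here is a minimal Python sketch of the selection rule described in
kdump.txt. It is not the kernel's parser: the names parse_size() and
crashkernel_size() are made up for this example, only decimal values with
K/M/G suffixes are handled, and a range is assumed to include its start and
exclude its end.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a string such as '64M' into bytes (decimal with K/M/G only)."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2).upper()]


def crashkernel_size(param, system_ram):
    """Return (size, offset) for a crashkernel= value; offset may be None."""
    value, _, offset_text = param.partition("@")
    offset = parse_size(offset_text) if offset_text else None
    if ":" not in value:
        # Simple form: crashkernel=size[@offset]
        return parse_size(value), offset
    # Extended form: range1:size1[,range2:size2,...][@offset]
    for entry in value.split(","):
        ram_range, size = entry.split(":")
        start, end = ram_range.split("-")
        low = parse_size(start)
        high = parse_size(end) if end else None  # open-ended range, e.g. '2G-'
        if system_ram >= low and (high is None or system_ram < high):
            return parse_size(size), offset
    return 0, offset  # no range matched: reserve nothing


# One string, three machines: 256M, 1G and 4G of RAM.
for ram in (256 << 20, 1 << 30, 4 << 30):
    print(ram >> 20, crashkernel_size("512M-2G:64M,2G-:128M", ram)[0] >> 20)
```

Note that the offset plays no part in choosing the size; it is carried along
unchanged.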

Where does specifying the offset/absolute-address come in?

>> As James mentioned for arm64, in case of relocatable/kaslr kernels,
>> the efistub may relocate the kernel over the top during boot as part
>> of its KASLR.
> 
> It would be sad if we couldn't specify kaslr and kdump at the same time.

We can. The problem comes when you specify an absolute-address that should be
reserved for user-space to eventually load the kdump kernel into. This is
fragile for a number of reasons.

> Since kaslr will skip any of memory regions whose attributes are not
> CONVENTIONAL_MEMORY for allocating a relocated kernel image, we will be
> able to have a dedicated range of memory reserved for kdump.
> In this case, using an "offset" in "crashkernel=" will be crucial.
> 
> (I don't know how we can notify uefi of the region though.)

I'm confused. panic()->kdump:boot doesn't go via UEFI. It passes information in
the DT:/chosen that may have been generated by the EFIStub, but it doesn't (and
must not) change the EFI memory map.

We need to decide where the crashdump kernel region is when the first-kernel
generates its page tables, as the protect/unprotect mechanism wants to be able
to unmap them.
There is no reason for UEFI to know about the kdump region: BootServices are
long gone by the time its location has to be decided.

>> So, the offset field may make more sense for
>> non-relocatable/static kernels, but for newer kernels, it's better to
>> use the 'Extended crashkernel syntax', which is also supported
>> by newer distribution versions.

(This extended crashkernel syntax looks like a tangent: it's about using a
single string to specify a reservation size that works on both small-memory
and large-memory machines.)

I don't think 'relocatable kernels' are relevant here. The KASLR series changed
the kernel to no longer run from the linear map, so where in the linear map we
allocate memory for the crash-kernel to boot from can't matter. These changes
were merged before kexec/kdump support was added.

Even before this change, (I recall that:) the kernel would discard memory below
its text. This isn't a problem as kexec-tools (typically) locates the kernel at
the bottom of the region, and physical memory outside this range isn't
accessible anyway because of the "linux,usable-memory" property. When we do want
to access it, we remap it using the vmcore helpers.

I think this @offset must be for kernels that have to run from a
physical/virtual address that is known at compile time. We don't have this
problem on arm64, and specifying @offset makes kdump less reliable:
| cannot reserve crashkernel: region overlaps reserved memory
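
To make the fragility concrete, here is a toy Python model of the two cases.
It is a sketch under stated assumptions, not the kernel's memblock code:
pick_crash_base() and overlaps() are hypothetical names, RAM is a single
range, and the 2MB alignment and the top-down search are assumptions of the
model.

```python
SZ_2M = 2 << 20


def overlaps(base, size, regions):
    """True if [base, base + size) intersects any (start, end) region."""
    end = base + size
    return any(base < r_end and r_start < end for r_start, r_end in regions)


def pick_crash_base(size, ram, reserved, base=None):
    """Return a base address for the crash region, or None on failure.

    ram is one (start, end) range; reserved is a list of (start, end)
    ranges that are already taken when the reservation is attempted.
    """
    ram_start, ram_end = ram
    if base is not None:
        # Fixed address: there is exactly one candidate, so any earlier
        # reservation that touches it makes the request fail.
        if base < ram_start or base + size > ram_end:
            return None
        if base % SZ_2M or overlaps(base, size, reserved):
            return None
        return base
    # No address given: walk down from the top of RAM in 2MB steps and
    # take the first hole that is large enough.
    candidate = (ram_end - size) // SZ_2M * SZ_2M
    while candidate >= ram_start:
        if not overlaps(candidate, size, reserved):
            return candidate
        candidate -= SZ_2M
    return None


# 1GB of RAM with something reserved at the bottom and at the top.
ram = (0x80000000, 0xC0000000)
reserved = [(0x80000000, 0x82000000), (0xBF000000, 0xC0000000)]

print(pick_crash_base(256 << 20, ram, reserved, base=0x80000000))  # None
print(hex(pick_crash_base(256 << 20, ram, reserved)))  # 0xaf000000
```

The fixed request fails as soon as anything else already sits in that range,
while the size-only request simply lands in whatever hole is free.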

> How better is it for the case?
> 
> I don't know exactly what you mean by "newer kernel/distribution", but
> kdump on arm64 supports this feature from the day one.
> (It is basically independent from architectures.)

Support @offset? Yes, it's the core code that allows this.

>> For e.g. see ubuntu trusty kdump-config man page -
>> <http://manpages.ubuntu.com/manpages/trusty/man8/kdump-config.8.html>:
>>
>>   kdump kernel relocation address does not match crashkernel= parameter:
>>               For non-relocatable architectures,  the  kdump  kernel  must  be
>>               built   with   a  predetermined  start  address.   This  message
>>               indicates that the start address of the  kdump  kernel  and  the
>>               start address in the crashkernel= parameter do not match.

arm64 doesn't have this issue. The 'predetermined start address' is a relative
value stored in the header. This is so the bootloader can place the kernel
anywhere in memory and still have it boot.

I think we're in the weeds here: adding @offset to your 'crashkernel=' cmdline
option tells the kernel you know this address will be free and not-reserved when
it comes to reservation time. This isn't generally true.
Unless you wrote the DT and the bootloader, you can't know this.

Wasn't this patch moving the elfcorehdr reservation up to be before any dynamic
reservations, to prevent them overlapping?

Thanks,

James