Re: [PATCH] arm64/mm: Introduce a variable to hold base address of linear region

Bhupesh Sharma <bhsharma@xxxxxxxxxx> · Tue, 19 Jun 2018 16:07:57 +0530

Hi James,

On Tue, Jun 19, 2018 at 3:46 PM, James Morse <james.morse@xxxxxxx> wrote:
> Hi Yanjiang, Will,
>
> On 19/06/18 10:57, Jin, Yanjiang wrote:
>>> -----Original Message-----
>>> From: Will Deacon [mailto:will.deacon@xxxxxxx]
>>> Sent: June 19, 2018 17:41
>>> To: Jin, Yanjiang <yanjiang.jin@xxxxxxxxxxxxxxxx>
>>> Cc: James Morse <james.morse@xxxxxxx>; Bhupesh Sharma
>>> <bhsharma@xxxxxxxxxx>; Mark Rutland <mark.rutland@xxxxxxx>; Ard
>>> Biesheuvel <ard.biesheuvel@xxxxxxxxxx>; Catalin Marinas
>>> <catalin.marinas@xxxxxxx>; Kexec Mailing List <kexec@xxxxxxxxxxxxxxxxxxx>;
>>> AKASHI Takahiro <takahiro.akashi@xxxxxxxxxx>; Bhupesh SHARMA
>>> <bhupesh.linux@xxxxxxxxx>; linux-arm-kernel
>>> <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>
>>> Subject: Re: [PATCH] arm64/mm: Introduce a variable to hold base address of
>>> linear region
>>>
>>> On Tue, Jun 19, 2018 at 09:34:56AM +0000, Jin, Yanjiang wrote:
>>>>> On Tue, Jun 19, 2018 at 03:02:15AM +0000, Jin, Yanjiang wrote:
>>>>>>> You seem to be using this for user-space phys_to_virt() based on
>>>>>>> values found in /proc/iomem. This should give you what you want,
>>>>>>> and isolate your user-space from the kernel's unexpected naming of
>>>>>>> variables.
>>>>>>
>>>>>> Could I simplify this problem?
>>>>>> Let's ignore what memstart_addr represents here; we just want to
>>>>>> implement phys_to_virt() in a userspace application (kexec-tools
>>>>>> or others).
>>>>>>
>>>>>> ARM64 Kernel has a below definition:
>>>>>>
>>>>>> #define __phys_to_virt(x)       ((unsigned long)((x) - PHYS_OFFSET) | PAGE_OFFSET)
>>>>>>
>>>>>> So a userspace app must know PHYS_OFFSET (equal to memstart_addr now).
>>>>>> This seems very simple, but memstart_addr goes through several
>>>>>> operations in arm64_memblock_init() depending on the kernel
>>>>>> configuration, so a userspace app needs to know many additional
>>>>>> definitions, as follows:
>>>>>>
>>>>>> memblock_start_of_DRAM(),  (ifdef CONFIG_SPARSEMEM_VMEMMAP),
>>>>>> ARM64_MEMSTART_SHIFT,  SECTION_SIZE_BITS,  PAGE_OFFSET,
>>>>>> memblock_end_of_DRAM(), IS_ENABLED(CONFIG_RANDOMIZE_BASE),
>>>>>> memstart_offset_seed.
>>>>>>
>>>>>> It is hard to know all of the above in kexec-tools now. Originally I
>>>>>> planned to read memstart_addr's value from "/dev/mem", but it was
>>>>>> pointed out that not all kernels enable "/dev/mem", so we had better
>>>>>> find a more generic approach. So we want to get some suggestions from
>>>>>> the ARM kernel community.
>>>>>> Can we export this variable on the kernel side through sysconf() or
>>>>>> other similar methods? Or can someone provide an effective way to get
>>>>>> memstart_addr's value?
>>>>>
>>>>> I thought the suggestion from James was to expose this via an ELF
>>>>> NOTE in kcore and vmcore (or in the header directly if that's possible,
>>>>> but I'm not sure about it)?
>>>>
>>>> First, thanks for your reply. But like DEVMEM, kcore is not a
>>>> must-have, so we can't depend on it.
>>>
>>> Neither is KEXEC. We can select PROC_KCORE from KEXEC if it helps.
>>>
>>>> On the other hand, phys_to_virt() is called while generating the vmcore
>>>> in kexec-tools, so the vmcore can't help with this issue either.
>>>
>>> I don't understand this part. If you have the vmcore in your hand, why
>>> can't you grok the pv offset from the note and use that in phys_to_virt()?
>>
>> It is a chicken-and-egg issue.
>> phys_to_virt() is for crashdump setup. To generate the vmcore, we must call
>> phys_to_virt(). At this point, no vmcore exists.
>
> It's needed for the parts of the ELF header that kexec-tools generates at kdump
> load time?
>
> So if this pv_offset were added to the key=value data that
> crash_save_vmcoreinfo_init() saves, it wouldn't be available early enough?

Yes, one case where it is not available early enough for makedumpfile
is when we determine the PT_NOTE contents from '/proc/kcore' on a
'live' system.

See <https://github.com/bhupesh-sharma/makedumpfile/blob/devel/elf_info.c#L375>
for example:

int set_kcore_vmcoreinfo(uint64_t vmcoreinfo_addr, uint64_t vmcoreinfo_len)
{
        <snip..>
        kvaddr = (ulong)vmcoreinfo_addr + PAGE_OFFSET;
}
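
To make the dependency concrete, here is a minimal userspace sketch of
the translation implied by the arm64 __phys_to_virt() definition quoted
earlier in this thread. The helper names and any addresses fed to them
are made up for illustration; only the formula comes from the quoted
macro. It suggests that adding PAGE_OFFSET directly, as in the snippet
above, matches the kernel's result only when PHYS_OFFSET is zero:

```c
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the arm64 macro quoted in this thread:
 *   #define __phys_to_virt(x) ((unsigned long)((x) - PHYS_OFFSET) | PAGE_OFFSET)
 * phys_offset and page_offset are parameters because userspace cannot
 * see the kernel's values directly (that is the problem being discussed).
 */
static inline uint64_t user_phys_to_virt(uint64_t phys, uint64_t phys_offset,
                                         uint64_t page_offset)
{
    return (phys - phys_offset) | page_offset;
}

/* The shortcut "addr + PAGE_OFFSET": agrees only if phys_offset == 0. */
static inline uint64_t naive_phys_to_virt(uint64_t phys, uint64_t page_offset)
{
    return phys + page_offset;
}
```

With an assumed PAGE_OFFSET of 0xffff800000000000 and an assumed
PHYS_OFFSET of 0x80000000, the two helpers disagree for a physical
address of 0x80001000, which is why the tool needs the real offset.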

Now the problem at hand is to determine the offset at which the
pv_offset (key=value data pair) lies in '/proc/kcore' (I assume
that when you mentioned, above and earlier, adding this pair to
the ELF notes you meant both the vmcoreinfo and '/proc/kcore'), as we
can have any number of PT_LOAD segments.

So we have a chicken-and-egg situation in such cases. Do you have
any pointers on how we can fix such use cases?
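
As a rough illustration of the lookup being discussed (not a fix for
the ordering problem), the sketch below walks the program headers of an
ELF64 image held in memory, such as the first bytes read from
'/proc/kcore', and reports the file offset and size of the PT_NOTE
segment. find_pt_note() is an illustrative helper written for this
mail, not an existing makedumpfile or kexec-tools function, and it
assumes a native-endian ELF64 image:

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Scan the ELF64 program header table in 'image' (len bytes) for PT_NOTE.
 * Returns 0 and fills note_off/note_size on success, -1 if the image is
 * malformed or has no PT_NOTE entry.
 */
static int find_pt_note(const unsigned char *image, size_t len,
                        uint64_t *note_off, uint64_t *note_size)
{
    Elf64_Ehdr ehdr;
    Elf64_Phdr phdr;
    size_t i;

    if (len < sizeof(ehdr))
        return -1;
    memcpy(&ehdr, image, sizeof(ehdr));
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
        ehdr.e_phentsize != sizeof(phdr) ||
        ehdr.e_phoff > len)
        return -1;

    for (i = 0; i < ehdr.e_phnum; i++) {
        uint64_t off = ehdr.e_phoff + (uint64_t)i * sizeof(phdr);

        if (off > len || len - off < sizeof(phdr))
            return -1;      /* header table runs past the buffer */
        memcpy(&phdr, image + off, sizeof(phdr));
        if (phdr.p_type == PT_NOTE) {
            *note_off = phdr.p_offset;
            *note_size = phdr.p_filesz;
            return 0;
        }
        /* PT_LOAD and other entries are skipped, however many exist. */
    }
    return -1;
}
```

The note contents would still have to be read and parsed from the
reported offset, which is where the address translation question above
comes back in.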

Thanks,
Bhupesh

> If we select PROC_KCORE for KEXEC, you know you will have /proc/kcore if the
> system supports kdump. We should probably provide the same information in the
> PT_NOTE section of the /proc/kcore file.
>
>
> (I thought the kdump kernel exported that crash_save_vmcoreinfo_init() data as
> an elf-note itself, but digging deeper I see the kernel exposes the physical
> address in /sys/kernel/vmcoreinfo. Presumably it's passed back via the kdump
> elfcorehdr.)
>
>
>>>> Unfortunately, not all platforms support reading the kernel config from a
>>>> userspace application, so kexec-tools can't know some key kernel options.
>>>> Otherwise, we could simulate the whole arm64_memblock_init() process
>>>> in kexec-tools.
>>>
>>> I don't understand what the kernel config has to do with kexec tools.
>>
>> I mean that if we could know the kernel .config in all circumstances, we
>> could calculate memstart_addr in kexec-tools as below:
>>
>>
>>         memstart_addr = round_down(memblock_start_of_DRAM(),
>>                                    ARM64_MEMSTART_ALIGN);
>
> This wouldn't work for KASLR. Having the kernel provide you with the offset
> means you are insulated from the details of phys_to_virt() and what affects
> these values. It should be possible to do this in the same way for all
> architectures.
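
If the kernel did export the offset as a key=value pair, the userspace
side could be as small as the sketch below, which looks up one key in a
vmcoreinfo-style text blob of "KEY=VALUE" lines. The function name and
the key "PV_OFFSET" used in the usage note are hypothetical; no such
key is defined anywhere in this thread:

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up "key=value\n" in a vmcoreinfo-style text blob of 'len' bytes.
 * On success copies the NUL-terminated value into 'out' and returns 0;
 * returns -1 if the key is absent or the value does not fit in 'out'.
 */
static int vmcoreinfo_lookup(const char *data, size_t len, const char *key,
                             char *out, size_t outlen)
{
    size_t klen = strlen(key);
    const char *p = data;
    const char *end = data + len;

    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        const char *eol = nl ? nl : end;
        size_t linelen = (size_t)(eol - p);

        /* Match the whole key, followed immediately by '='. */
        if (linelen > klen && p[klen] == '=' && memcmp(p, key, klen) == 0) {
            size_t vlen = linelen - klen - 1;

            if (vlen >= outlen)
                return -1;
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p = nl ? nl + 1 : end;
    }
    return -1;
}
```

A caller would pass the note payload and a key such as the hypothetical
"PV_OFFSET", then convert the returned string with strtoull(). The tool
would then need no knowledge of KASLR or the kernel configuration.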
>
>
> Thanks,
>
> James
