Re: [PATCH] arm64, vmcoreinfo : Append 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' to vmcoreinfo

Bhupesh Sharma <bhsharma@xxxxxxxxxx> · Mon, 4 Feb 2019 20:05:41 +0530

Hi James, All,

On 01/31/2019 03:09 AM, Bhupesh Sharma wrote:
Hi James,

Thanks for review.
Please see my comments inline.


On 01/30/2019 08:51 PM, James Morse wrote:
Hi Bhupesh,

On 01/30/2019 12:23 PM, Bhupesh Sharma wrote:
With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
supports these extensions can support up to 52-bit virtual and 52-bit
physical addresses respectively.

Since at the moment we enable the support of these extensions via CONFIG
flags, e.g.
  - LPA via CONFIG_ARM64_PA_BITS_52

there are no clear mechanisms in user-space right now to
determine these CONFIG flag values and also determine the PARange and
VARange address values.
User-space tools like 'makedumpfile' and 'crash-utility' can instead
use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
the maximum virtual address and physical address (respectively)
supported by underlying kernel.

A reference 'makedumpfile' implementation which uses this approach to
determining the maximum physical address is available in [0].

Why does it need to know?

(Suzuki asked the same question on your earlier version)
https://lore.kernel.org/linux-arm-kernel/cff44754-7fe4-efea-bc8e-4dde2277c821@xxxxxxx/ 


I have shared some details (after discussion with our test teams) in 
reply to the review comments from Suzuki here:
http://lists.infradead.org/pipermail/kexec/2019-January/022389.html, and
http://lists.infradead.org/pipermail/kexec/2019-January/022390.html

To summarize, I mentioned in my replies to the review comments that
the makedumpfile implementation (for decoding the PTE) was just an
example. There can be other user-space applications, e.g. a
user-space application running with a 48-bit kernel VA and a 52-bit
user-space VA and requesting an allocation at a 'high' address via a
'hint' to mmap.

From your github link it looks like you use this to re-assemble the
two bits of the PFN from the pte. Can't you always do this for 64K
pages? CPUs with the feature always do this too, it's not something the
kernel turns on.

Ok, let me try to give some perspective of a common makedumpfile 
use-case before I jump into the details:

(a) The makedumpfile tool can be used to generate a vmcore and analyze
it later. For example, we can create a vmcore on a system running with
page-size = 64K and analyze it later on a different system using
page-size = 4K.

Since several makedumpfile code legs (for the page-table walk) are
common to both paths (creating a vmcore and analyzing a vmcore), we
cannot hardcode the PTE calculation masks for either 48-bit or 52-bit
address spaces (or 4K/64K page sizes). Example invocations for the two
cases are given below:

Create a vmcore dump on a 64K machine:
# makedumpfile -l --message-level 1 -d 31 /proc/vmcore vmcore

Analyze the vmcore dump on a 4K machine:
# makedumpfile -d 31 -x vmlinux vmcore dumpfile

Also, hardcoding the PTE calculation to always use the high address bit
mask will break backward compatibility with older kernels (which
don't support the 52-bit address space extensions).
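
To make the point about PTE masks concrete, here is a minimal sketch of
a PTE-to-physical-address conversion that is driven by an exported
MAX_PHYSMEM_BITS value instead of a hardcoded mask. The helper name
pte_to_phys() and its parameters are hypothetical and are not taken
from makedumpfile. The bit layout assumed here (PA[51:48] stored in
descriptor bits [15:12] for the 64K granule) is my understanding of how
the arm64 kernel handles CONFIG_ARM64_PA_BITS_52, so please treat it as
an illustration rather than a reference implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not makedumpfile code): extract the output
 * address from a last-level arm64 PTE.
 *
 * page_shift       - 12 for 4K pages, 16 for 64K pages
 * max_physmem_bits - value read from vmcoreinfo (assumed 48 or 52)
 */
static uint64_t pte_to_phys(uint64_t pte, int page_shift, int max_physmem_bits)
{
	/* Address bits [47:page_shift] sit in place in the descriptor. */
	uint64_t low_mask = ((UINT64_C(1) << (48 - page_shift)) - 1) << page_shift;
	uint64_t phys = pte & low_mask;

	if (max_physmem_bits == 52 && page_shift == 16) {
		/*
		 * Assumed 52-bit PA layout for the 64K granule:
		 * PA[51:48] is carried in descriptor bits [15:12].
		 */
		uint64_t high_mask = UINT64_C(0xf) << 12;

		phys |= (pte & high_mask) << 36;
	}

	return phys;
}
```

With this shape the same page-table walk code can decode a dump from a
48-bit PA kernel and from a 52-bit PA kernel, because the choice of
mask follows the value recorded in the dump rather than the
configuration of the machine doing the analysis.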

(b) Also, x86_64 already has a vmcoreinfo export for 'pgtable_l5_enabled':

void arch_crash_save_vmcoreinfo(void)
{
     <.. snip..>
     vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
             pgtable_l5_enabled());
}

And the makedumpfile code uses the same to determine support for 5-level 
page tables in x86_64, see 
<https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/x86_64.c#L36> 
for example.
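
As a rough illustration of how a user-space tool could consume such an
export, the sketch below looks up a "NUMBER(<name>)=<value>" line in a
vmcoreinfo text blob and falls back to a default when the key is
missing (as it would be on an older kernel). The function names
vmcoreinfo_number() and max_physmem_bits_or_default() are hypothetical
and are not the makedumpfile or crash-utility API; the fallback of 48
is likewise an assumption made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "NUMBER(<name>)=<value>" at the start of a
 * line in a NUL-terminated vmcoreinfo text blob.
 * Returns 0 and stores the value in *out, or -1 if the key is absent.
 */
static int vmcoreinfo_number(const char *info, const char *name, long *out)
{
	char key[128];
	int n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);

	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;

	for (const char *p = info; p && *p; ) {
		if (strncmp(p, key, (size_t)n) == 0) {
			char *end;
			/* VMCOREINFO_NUMBER() values are printed in decimal. */
			long v = strtol(p + n, &end, 10);

			if (end == p + n)
				return -1;	/* no digits after '=' */
			*out = v;
			return 0;
		}
		/* Advance to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (p)
			p++;
	}

	return -1;
}

/* Assumed policy for the example: use 48 when the key is not exported. */
static long max_physmem_bits_or_default(const char *info)
{
	long bits;

	if (vmcoreinfo_number(info, "MAX_PHYSMEM_BITS", &bits) == 0)
		return bits;
	return 48;
}
```

Keeping the fallback in one place means dumps from kernels that do not
export the new values are still decoded the way they are today.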

Ping. This patch fixes a regression in user-space tools like
makedumpfile and crash-utility, which have been broken since arm64
kernels with 52-bit VA and PA support became available (and since
distributions started enabling them). I would request review
comments/an ack on this simple change.

Thanks,
Bhupesh

diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
index ca4c3e12d8c5..ad231be5c0d8 100644
--- a/arch/arm64/kernel/crash_core.c
+++ b/arch/arm64/kernel/crash_core.c
@@ -10,6 +10,8 @@
  void arch_crash_save_vmcoreinfo(void)
  {
      VMCOREINFO_NUMBER(VA_BITS);
+    VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
+    VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
      /* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
      vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
                          kimage_voffset);




