Hi James, All,
On 01/31/2019 03:09 AM, Bhupesh Sharma wrote:
Hi James,
Thanks for review.
Please see my comments inline.
On 01/30/2019 08:51 PM, James Morse wrote:
Hi Bhupesh,
On 01/30/2019 12:23 PM, Bhupesh Sharma wrote:
With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
supports these extensions can support up to 52-bit virtual and 52-bit
physical addresses respectively.
Since support for these extensions is currently enabled via CONFIG
flags, e.g.
- LPA via CONFIG_ARM64_PA_BITS_52
there is no clear mechanism in user-space to determine these CONFIG
flag values, or the PARange and VARange address values.
User-space tools like 'makedumpfile' and 'crash-utility' can instead
use the 'MAX_USER_VA_BITS' and 'MAX_PHYSMEM_BITS' values to determine
the maximum virtual address and physical address (respectively)
supported by underlying kernel.
A reference 'makedumpfile' implementation which uses this approach to
determining the maximum physical address is available in [0].
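As a rough illustration of how a user-space tool might consume these
values, here is a minimal sketch in C that looks up a NUMBER(...) entry in
the vmcoreinfo text. The helper name vmcoreinfo_read_number() is
hypothetical and is not taken from makedumpfile or crash-utility; it only
assumes the "NUMBER(name)=value" line format that VMCOREINFO_NUMBER() emits.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "NUMBER(<name>)=<value>" in a vmcoreinfo text blob.
 * Returns 0 and stores the value in *out on success, or -1 if the key is
 * absent (for example on an older kernel that does not export it).
 * Hypothetical helper for illustration only.
 */
static int vmcoreinfo_read_number(const char *info, const char *name, long *out)
{
	char key[128];
	const char *line = info;
	size_t keylen;
	int n;

	assert(info != NULL && name != NULL && out != NULL);

	n = snprintf(key, sizeof(key), "NUMBER(%s)=", name);
	if (n < 0 || (size_t)n >= sizeof(key))
		return -1;
	keylen = (size_t)n;

	/* Walk the blob one line at a time, matching the key at line start. */
	while (line && *line) {
		if (strncmp(line, key, keylen) == 0) {
			char *end;
			/* Base 0 accepts both decimal and 0x-prefixed values. */
			long v = strtol(line + keylen, &end, 0);

			if (end == line + keylen)
				return -1;	/* no digits after '=' */
			*out = v;
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A tool could call this with "MAX_PHYSMEM_BITS" and fall back to its
existing default when -1 is returned, which keeps older kernels working.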
Why does it need to know?
(Suzuki asked the same question on your earlier version)
https://lore.kernel.org/linux-arm-kernel/cff44754-7fe4-efea-bc8e-4dde2277c821@xxxxxxx/
I have shared some details (after discussion with our test teams) in
reply to the review comments from Suzuki here:
http://lists.infradead.org/pipermail/kexec/2019-January/022389.html, and
http://lists.infradead.org/pipermail/kexec/2019-January/022390.html
Just to summarize, I mentioned in my replies to the review comments that
the makedumpfile implementation (for decoding the PTE) was just an
example. There can be other user-space applications, e.g. a
user-space application running with 48-bit kernel VA and 52-bit user
space VA and requesting an allocation at a 'high' address via a 'hint' to mmap.
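For the mmap case, a minimal sketch of an application passing a hint above
the 48-bit boundary could look like the following. It assumes a 64-bit
Linux system; map_with_high_hint() and is_above_48bit() are hypothetical
names. On kernels or hardware without 52-bit user VA support the hint is
simply not honoured and the mapping lands in the default range.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Request an anonymous mapping with an address hint just above the
 * 48-bit boundary. The hint is advisory (no MAP_FIXED), so the kernel
 * is free to place the mapping elsewhere. Returns MAP_FAILED on error.
 */
static void *map_with_high_hint(size_t len)
{
	void *hint = (void *)(uintptr_t)(UINT64_C(1) << 48);

	assert(len > 0);
	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Non-zero if the address needs more than 48 bits to represent. */
static int is_above_48bit(const void *p)
{
	return ((uintptr_t)p >> 48) != 0;
}
```

The caller would check the result against MAP_FAILED first and could then
use is_above_48bit() to see whether the hint was actually honoured.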
From your github link it looks like you use this to re-assemble the
two bits of the PFN from the pte. Can't you always do this for 64K
pages? CPUs with the feature always do this too, it's not something the
kernel turns on.
Ok, let me try to give some perspective of a common makedumpfile
use-case before I jump into the details:
(a) makedumpfile tool can be used to generate a vmcore and analyze it
later. So for example we can create vmcore for a system running with
page-size = 64K and analyze it later on a different system using
page-size = 4K.
Since several makedumpfile code legs (for page-table walk) are common in
both the paths (creating a vmcore and analyzing a vmcore), we cannot
hardcode the PTE calculation masks for either 48-bit or 52-bit address
spaces (or 4K/64K page sizes). The example invocations for the two cases
are given below:
Create a vmcore dump on a 64K machine:
# makedumpfile -l --message-level 1 -d 31 /proc/vmcore vmcore
Analyze the vmcore dump on a 4K machine:
# makedumpfile -d 31 -x vmlinux vmcore dumpfile
Also, hardcoding the PTE calculation to always use the high address bit
mask would break backward compatibility with older kernels (which
don't support the 52-bit address space extensions).
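To make the masking concern concrete, here is a small sketch of a
PTE-to-physical-address decode that is parameterised by the page size and
by the MAX_PHYSMEM_BITS value read from vmcoreinfo, rather than hardcoded.
It assumes the 64K-granule 52-bit layout where PA[51:48] sits in PTE bits
[15:12]; pte_to_phys() is a hypothetical helper and not the actual
makedumpfile code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the output (physical) address from an arm64 PTE value.
 * page_shift is 12, 14 or 16 (4K/16K/64K pages); max_physmem_bits is the
 * value exported by the kernel. Illustrative sketch only.
 */
static uint64_t pte_to_phys(uint64_t pte, unsigned int page_shift,
			    unsigned int max_physmem_bits)
{
	uint64_t low_mask, phys;

	assert(page_shift == 12 || page_shift == 14 || page_shift == 16);

	/* PA[47:page_shift] is stored in place in the PTE. */
	low_mask = ((UINT64_C(1) << (48 - page_shift)) - 1) << page_shift;
	phys = pte & low_mask;

	/*
	 * Assumption: with 52-bit PA and 64K pages, PTE bits [15:12] carry
	 * PA[51:48], so shift them up by 36. Kernels without 52-bit PA
	 * support must not apply this step.
	 */
	if (max_physmem_bits == 52 && page_shift == 16)
		phys |= (pte & (UINT64_C(0xf) << 12)) << 36;

	return phys;
}
```

With this shape the same page-table walk code can serve a 48-bit 4K
vmcore and a 52-bit 64K vmcore, with the choice driven by the exported
values.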
(b). Also x86_64 already has a vmcoreinfo export for 'pgtable_l5_enabled':
void arch_crash_save_vmcoreinfo(void)
{
	<.. snip..>
	vmcoreinfo_append_str("NUMBER(pgtable_l5_enabled)=%d\n",
			pgtable_l5_enabled());
}
And the makedumpfile code uses the same to determine support for 5-level
page tables in x86_64, see
<https://github.com/bhupesh-sharma/makedumpfile/blob/52-bit-pa-support-via-vmcore-v1/arch/x86_64.c#L36>
for example.
Ping. This patch fixes a regression in user-space tools like
makedumpfile and crash-utility, which have been broken ever since arm64
kernels with 52-bit VA and PA support became available (and distributions
started enabling them). I would request review comments or an ack on this
simple change.
Thanks,
Bhupesh
diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
index ca4c3e12d8c5..ad231be5c0d8 100644
--- a/arch/arm64/kernel/crash_core.c
+++ b/arch/arm64/kernel/crash_core.c
@@ -10,6 +10,8 @@
 void arch_crash_save_vmcoreinfo(void)
 {
 	VMCOREINFO_NUMBER(VA_BITS);
+	VMCOREINFO_NUMBER(MAX_USER_VA_BITS);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 	/* Please note VMCOREINFO_NUMBER() uses "%d", not "%x" */
 	vmcoreinfo_append_str("NUMBER(kimage_voffset)=0x%llx\n",
 						kimage_voffset);