On Fri, May 20, 2016 at 03:06:39PM -0400, Dave Anderson wrote:
> 
> 
> Hi Takahiro,
> 
> Welcome to the mailing list -- you are a most valuable addition.
> 
> To others in the list, Takahiro and I have been communicating offline
> for a couple weeks, and I convinced him to join us.  He works on both
> kexec-tools and the crash utility for Linaro on the ARM64 architecture.
> If you are unaware, in Linux 4.6 there was a major change in the ARM64
> virtual memory layout, complicated by the layering of KASLR on top of it.
> The new VM layout has broken crash utility support completely, and Takahiro
> is tackling both.
> 
> My comments and questions on the v1 patch follow...
> 
> ----- Original Message -----
> > Hi,
> > 
> > This patch is still rough-edged, but please review it; any comments
> > are very welcome.  I will try to fix the known issues before I submit
> > a new version of the kexec/kdump patch for the v4.7 merge window.
> > 
> > Thanks,
> > -Takahiro AKASHI
> > 
> > ===8<===
> > >From fdc7c881d98ef00ed1ff38a058b4913a1d5bcda6 Mon Sep 17 00:00:00 2001
> > From: AKASHI Takahiro <takahiro.akashi@xxxxxxxxxx>
> > Date: Mon, 16 May 2016 17:31:55 +0900
> > Subject: [PATCH v1] arm64: fix kernel memory map handling for kaslr-enabled
> >  kernel
> > 
> > In kernel v4.6, Kernel ASLR (KASLR) is supported on arm64, and the start
> > address of the kernel image can be randomized if CONFIG_RANDOMIZE_BASE is
> > enabled.  Even worse, the kernel image is no longer mapped in the linear
> > mapping, but in the vmalloc area (i.e. below PAGE_OFFSET).
> > 
> > Now, according to the kernel's memory.h, converting a virtual address to
> > a physical address should be done as follows:
> > 
> >     phys_addr_t __x = (phys_addr_t)(x);                             \
> >     __x & BIT(VA_BITS - 1) ? (__x & ~PAGE_OFFSET) + PHYS_OFFSET :   \
> >                              (__x - kimage_voffset); })
> > 
> > Please note that PHYS_OFFSET is no longer equal to the start address of
> > the first usable memory block in SYSTEM RAM due to the fact mentioned
> > above.
> 
> So it is no longer possible to use /proc/iomem if KASLR is enabled
> on a live system?  That being the case, we need a way for root to
> be able to determine what it is for live system analysis.

Now that PHYS_OFFSET is defined as "memstart_addr", we can get the value
if we can access this symbol (on a live system).

> > 
> > This patch addresses this change and allows the crash utility to access
> > memory contents with correct addresses.
> > *However*, it is still rough-edged and we need more refinements.
> > 
> > 1) Currently we have to specify the following command line:
> >       $ crash --kaslr auto
> >               --machdep phys_offset=XXX
> >               --machdep kimage_voffset=YYY
> >               <vmlinux> <vmcore>
> 
> I presume that live system analysis requires the same extra arguments
> at this point?

I'm still investigating the live-system case.

> 
> > To remove these extra options, I'm planning to enhance my kdump patch
> > so that /proc/vmcore holds the necessary information for analyzing the
> > contents of kernel memory.
> > 
> > 2) My current change breaks backward compatibility.
> >    Crash won't work with v4.5 or earlier kernels, partly because I changed
> >    the calculation of VA_BITS.  (I think that the existing code is wrong.)
> >    So far I haven't had enough time to look into this issue as I'm focusing
> >    on KASLR support.
> 
> I don't understand why you think the existing code is wrong.  It is
> simply trying to determine what the value of CONFIG_ARM64_VA_BITS is.
> For example, we (Red Hat) are currently using a 4.5.0 based kernel, which
> has this configuration:
> 
>   config-arm64:CONFIG_ARM64_VA_BITS=48
> 
> And that's what crash-7.1.5 calculates:
> 
>   crash> sys | grep RELEASE
>          RELEASE: 4.5.0-0.38.el7.aarch64
>   crash> help -m
>   ... [ cut ] ...
>              VA_BITS: 48
>        userspace_top: 0001000000000000
>          page_offset: ffff800000000000
>   vmalloc_start_addr: ffff000000000000
>          vmalloc_end: ffff7fbfdffeffff
>        modules_vaddr: ffff7ffffc000000
>          modules_end: ffff7fffffffffff
>        vmemmap_vaddr: ffff7fbfe0000000
>          vmemmap_end: ffff7fffdfffffff
>          phys_offset: 4000000000
>   ...
> 
> I'm also presuming that your VA_BITS-related change is what breaks
> backwards-compatibility, because if I run "crash -d1" with your patch
> applied, I see this debug output on the same kernel above:
> 
>   VA_BITS: 47
> 
> On the other hand, if I run your patch on a Fedora 4.6.0-0.rc7.git2.1.fc25
> live kernel, it comes up OK, and shows this:
> 
>              VA_BITS: 42
>        userspace_top: 0000040000000000
>          page_offset: fffffe0000000000

I still haven't had time to look into this issue, but as far as those three
values are concerned, they are consistent and VA_BITS seems to be correct
(page_offset fffffe0000000000 is ~0UL << 41, i.e. VA_BITS - 1 = 41, and
userspace_top 0000040000000000 is 1UL << 42).

>   vmalloc_start_addr: fffffc0008000000
>          vmalloc_end: fffffdff5ffeffff
>        modules_vaddr: fffffc0000000000
>          modules_end: fffffc0007ffffff
>        vmemmap_vaddr: fffffdff80000000
>          vmemmap_end: fffffdffffffffff
>          phys_offset: 4000000000
> 
> KASLR is NOT configured, so perhaps that figures into why it works?
> 
> > 3) Backtracing a 'panic'ed task fails:
> >       crash> bt
> >       PID: 999      TASK: ffffffe960081800  CPU: 0   COMMAND: "sh"
> >       bt: WARNING: corrupt prstatus? pstate=0x80000000, but no user frame found
> >       bt: WARNING: cannot determine starting stack frame for task ffffffe960081800
> > 
> >    frame->pc indicates that it is a kernel address, so obviously something
> >    is wrong.  I'm digging into it.

This seems to be a bug in my kdump patch.  It mistakenly(?) returns spsr_el1
for regs->pstate, not the current pstate, but we can't get the current pstate
directly on arm64.

> OK...
> 
> First, before posting, please kick off a "make warn" compilation in order to clean up
> things like this:
> 
>   $ make warn
>   ... [ cut ] ...
>   cc -c -g -DARM64 -DLZO -DSNAPPY -DGDB_7_6 arm64.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
>   arm64.c:78:1: warning: no previous prototype for 'arm64_VTOP' [-Wmissing-prototypes]
>    arm64_VTOP(ulong addr)
>    ^
>   arm64.c: In function 'arm64_parse_cmdline_args':
>   arm64.c:635:8: warning: unused variable 'value' [-Wunused-variable]
>     ulong value = 0;
>     ^
>   arm64.c:631:19: warning: unused variable 'err' [-Wunused-variable]
>     int index, i, c, err;
>     ^
>   arm64.c: In function 'arm64_init':
>   arm64.c:2539:17: warning: 'vmalloc_end' may be used uninitialized in this function [-Wmaybe-uninitialized]
>     vmemmap_start = vmalloc_end + SZ_64K;
>     ^
>   arm64.c:2510:8: note: 'vmalloc_end' was declared here
>     ulong vmalloc_end;
>   ...
> 
> As I mentioned in one of our earlier conversations, "vmalloc_end" is most definitely
> being used uninitialized.

Fixed all those warnings.
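For the "vmalloc_end may be used uninitialized" one, the fix is essentially
to compute vmalloc_end before it is consumed in the pre-4.6 path, something
like the following rough sketch of the else branch (only identifiers already
used in the patch; the unconditional vmalloc_end assignment after the if/else
then needs to be moved into, or dropped from, the branches accordingly):

        } else {
                vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift))
                                        * SIZE(page), PUD_SIZE);

                /* compute vmalloc_end first, as the pre-patch code did */
                vmalloc_end = (ms->page_offset - PUD_SIZE - vmemmap_size - SZ_64K);
                vmemmap_start = vmalloc_end + SZ_64K;
                vmemmap_end = vmemmap_start + vmemmap_size;
        }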
> Patch comments are intermingled below:
> 
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi@xxxxxxxxxx>
> > ---
> >  arm64.c   | 165 +++++++++++++++++++++++++++++++++++++++++++++-----------------
> >  defs.h    |  22 ++++++---
> >  main.c    |   7 +--
> >  symbols.c |  10 ++--
> >  4 files changed, 146 insertions(+), 58 deletions(-)
> > 
> > diff --git a/arm64.c b/arm64.c
> > index 34c8c59..22ddade 100644
> > --- a/arm64.c
> > +++ b/arm64.c
> > @@ -22,6 +22,8 @@
> >  #include <endian.h>
> >  
> >  #define NOT_IMPLEMENTED(X) error((X), "%s: function not implemented\n", __func__)
> > +/* FIXME: temporarily used in kt->flags2 */
> > +#define ARM64_NEW_VMEMMAP 0x80000000
> 
> Since ARM64_NEW_VMEMMAP is only used in arm64.c, please put it
> in the machdep->flags field (under IRQ_STACKS):

Yes.

> >  
> >  static struct machine_specific arm64_machine_specific = { 0 };
> >  static int arm64_verify_symbol(const char *, ulong, char);
> > @@ -72,6 +74,21 @@ static int arm64_get_crash_notes(void);
> >  static void arm64_calc_VA_BITS(void);
> >  static int arm64_is_uvaddr(ulong, struct task_context *);
> >  
> > +ulong
> > +arm64_VTOP(ulong addr)
> > +{
> > +    if (!(kt->flags2 & ARM64_NEW_VMEMMAP) ||
> > +        (addr >= machdep->machspec->page_offset)) {
> > +        return machdep->machspec->phys_offset
> > +            + (addr - machdep->machspec->page_offset);
> > +    } else {
> > +        if (machdep->machspec->kimage_voffset)
> > +            return addr - machdep->machspec->kimage_voffset;
> > +        else /* no randomness */
> > +            return machdep->machspec->phys_offset
> > +                + (addr - machdep->machspec->vmalloc_start_addr);
> > +    }
> > +}
> >  
> >  /*
> >   * Do all necessary machine-specific setup here.  This is called several times
> > @@ -105,6 +122,10 @@ arm64_init(int when)
> >          break;
> >  
> >      case PRE_GDB:
> > +        /* FIXME: Is kimage_voffset appropriate? */
> > +        if (kernel_symbol_exists("kimage_voffset"))
> > +            kt->flags2 |= ARM64_NEW_VMEMMAP;
> > +
> 
> As far as I can tell, "kimage_voffset" is perfect.

Yes, I will remove the "FIXME" comment.

> 
> >          if (!machdep->pagesize) {
> >              /*
> >               * Kerneldoc Documentation/arm64/booting.txt describes
> > @@ -160,16 +181,33 @@ arm64_init(int when)
> >          machdep->pagemask = ~((ulonglong)machdep->pageoffset);
> >  
> >          arm64_calc_VA_BITS();
> > -        machdep->machspec->page_offset = ARM64_PAGE_OFFSET;
> > +        ms = machdep->machspec;
> > +        ms->page_offset = ARM64_PAGE_OFFSET;
> >          machdep->identity_map_base = ARM64_PAGE_OFFSET;
> > -        machdep->machspec->userspace_top = ARM64_USERSPACE_TOP;
> > -        machdep->machspec->modules_vaddr = ARM64_MODULES_VADDR;
> > -        machdep->machspec->modules_end = ARM64_MODULES_END;
> > -        machdep->machspec->vmalloc_start_addr = ARM64_VMALLOC_START;
> > -        machdep->machspec->vmalloc_end = ARM64_VMALLOC_END;
> > -        machdep->kvbase = ARM64_VMALLOC_START;
> > -        machdep->machspec->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> > -        machdep->machspec->vmemmap_end = ARM64_VMEMMAP_END;
> > +        ms->userspace_top = ARM64_USERSPACE_TOP;
> > +        if (kt->flags2 & ARM64_NEW_VMEMMAP) {
> > +            struct syment *sp;
> > +
> > +            sp = kernel_symbol_search("_text");
> > +            ms->kimage_text = (sp ? sp->value : 0);
> > +            sp = kernel_symbol_search("_end");
> > +            ms->kimage_end = (sp ? sp->value : 0);
> > +
> > +            ms->modules_vaddr = ARM64_VA_START;
> > +            if (kernel_symbol_exists("kasan_init"))
> > +                ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
> > +            ms->modules_end = ms->modules_vaddr
> > +                        + ARM64_MODULES_VSIZE -1;
> > +
> > +            ms->vmalloc_start_addr = ms->modules_end + 1;
> > +        } else {
> > +            ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
> > +            ms->modules_end = ARM64_PAGE_OFFSET - 1;
> > +            ms->vmalloc_start_addr = ARM64_VA_START;
> > +        }
> > +        ms->vmalloc_end = ARM64_VMALLOC_END;
> > +        ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> > +        ms->vmemmap_end = ARM64_VMEMMAP_END;
> >  
> >          switch (machdep->pagesize)
> >          {
> > @@ -543,6 +581,42 @@ arm64_dump_machdep_table(ulong arg)
> >      }
> >  }
> 
> Although you haven't changed arm64_dump_machdep_table(), this reminds
> me: can you please display the 3 new machine_specific fields in
> arm64_dump_machdep_table()?  And also the ARM64_NEW_VMEMMAP bit after
> you move it into the flags field?  ("help -m" is your friend...)

Yes.

> 
> > +static int
> > +arm64_parse_machdep_arg_l(char *argstring, char *param, ulong *value)
> > +{
> > +    int len;
> > +    int megabytes = FALSE;
> > +    char *p;
> > +
> > +    len = strlen(param);
> > +    if (!STRNEQ(argstring, param) || (argstring[len] != '='))
> > +        return FALSE;
> > +
> > +    if ((LASTCHAR(argstring) == 'm') ||
> > +        (LASTCHAR(argstring) == 'M')) {
> > +        LASTCHAR(argstring) = NULLCHAR;
> > +        megabytes = TRUE;
> > +    }
> > +
> > +    p = argstring + len + 1;
> > +    if (strlen(p)) {
> > +        int flags = RETURN_ON_ERROR | QUIET;
> > +        int err = 0;
> > +
> > +        if (megabytes) {
> > +            *value = dtol(p, flags, &err);
> > +            if (!err)
> > +                *value = MEGABYTES(*value);
> > +        } else {
> > +            *value = htol(p, flags, &err);
> > +        }
> > +
> > +        if (!err)
> > +            return TRUE;
> > +    }
> > +
> > +    return FALSE;
> > +}
> >  
> >  /*
> >   *  Parse machine dependent command line arguments.
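(A note here, just before the --machdep parsing hunk:)  For the live-system
case Dave asked about above, I think we could probably avoid the extra
--machdep phys_offset= option by reading the kernel's "memstart_addr" symbol
once the symbol table is available.  A rough, untested sketch for arm64.c,
using the existing kernel_symbol_exists()/symbol_value()/readmem() helpers
(the function name is just illustrative, and whether a KVADDR read works this
early on a live system, before phys_offset is known, still needs checking):

        /* sketch only: pick up PHYS_OFFSET from a live kernel's
         * "memstart_addr" instead of requiring --machdep phys_offset= */
        static void
        arm64_get_phys_offset_live(void)
        {
                ulong phys_offset;

                if (!ACTIVE() || !kernel_symbol_exists("memstart_addr"))
                        return;

                if (readmem(symbol_value("memstart_addr"), KVADDR,
                    &phys_offset, sizeof(phys_offset), "memstart_addr",
                    QUIET|RETURN_ON_ERROR))
                        machdep->machspec->phys_offset = phys_offset;
        }

Something similar might be possible for kimage_voffset as well.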
> > @@ -580,39 +654,23 @@ arm64_parse_cmdline_args(void)
> >          c = parse_line(buf, arglist);
> >  
> >          for (i = 0; i < c; i++) {
> > -            err = 0;
> > -
> > -            if (STRNEQ(arglist[i], "phys_offset=")) {
> > -                int megabytes = FALSE;
> > -                int flags = RETURN_ON_ERROR | QUIET;
> > -
> > -                if ((LASTCHAR(arglist[i]) == 'm') ||
> > -                    (LASTCHAR(arglist[i]) == 'M')) {
> > -                    LASTCHAR(arglist[i]) = NULLCHAR;
> > -                    megabytes = TRUE;
> > -                }
> > -
> > -                p = arglist[i] + strlen("phys_offset=");
> > -                if (strlen(p)) {
> > -                    if (megabytes)
> > -                        value = dtol(p, flags, &err);
> > -                    else
> > -                        value = htol(p, flags, &err);
> > -                }
> > -
> > -                if (!err) {
> > -                    if (megabytes)
> > -                        value = MEGABYTES(value);
> > -
> > -                    machdep->machspec->phys_offset = value;
> > -
> > -                    error(NOTE,
> > -                        "setting phys_offset to: 0x%lx\n\n",
> > -                        machdep->machspec->phys_offset);
> > +            if (arm64_parse_machdep_arg_l(arglist[i],
> > +                    "phys_offset",
> > +                    &machdep->machspec->phys_offset)) {
> > +                error(NOTE,
> > +                    "setting phys_offset to: 0x%lx\n\n",
> > +                    machdep->machspec->phys_offset);
> > +
> > +                machdep->flags |= PHYS_OFFSET;
> > +                continue;
> > +            } else if (arm64_parse_machdep_arg_l(arglist[i],
> > +                    "kimage_voffset",
> > +                    &machdep->machspec->kimage_voffset)) {
> > +                error(NOTE,
> > +                    "setting kimage_voffset to: 0x%lx\n\n",
> > +                    machdep->machspec->kimage_voffset);
> >  
> > -                    machdep->flags |= PHYS_OFFSET;
> > -                    continue;
> > -                }
> > +                continue;
> >              }
> >  
> >              error(WARNING, "ignoring --machdep option: %s\n",
> > @@ -2377,6 +2435,11 @@ arm64_IS_VMALLOC_ADDR(ulong vaddr)
> >  {
> >          struct machine_specific *ms = machdep->machspec;
> >  
> > +    if ((kt->flags2 & ARM64_NEW_VMEMMAP) &&
> > +        (vaddr >= machdep->machspec->kimage_text) &&
> > +        (vaddr <= machdep->machspec->kimage_end))
> > +        return FALSE;
> > +
> >          return ((vaddr >= ms->vmalloc_start_addr && vaddr <= ms->vmalloc_end) ||
> >                  ((machdep->flags & VMEMMAP) &&
> >                   (vaddr >= ms->vmemmap_vaddr && vaddr <= ms->vmemmap_end)) ||
> > @@ -2407,7 +2470,11 @@ arm64_calc_VA_BITS(void)
> >  
> >      for (bitval = highest_bit_long(value); bitval; bitval--) {
> >          if ((value & (1UL << bitval)) == 0) {
> > +#if 1
> > +            machdep->machspec->VA_BITS = bitval + 1;
> > +#else /* FIXME: bug? */
> >              machdep->machspec->VA_BITS = bitval + 2;
> > +#endif
> 
> Yeah, the new scheme needs further investigation...
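Agreed.  My working theory on the "+1" vs "+2" difference: before v4.6 the
symbol value we scan lives in the linear map just above PAGE_OFFSET, so bit
(VA_BITS - 1) is set and the first clear bit is (VA_BITS - 2), which is where
the existing "+2" comes from; from v4.6 on, the kernel image sits in the
vmalloc region just above VA_START, bit (VA_BITS - 1) is clear, and "+1" is
what recovers VA_BITS.  A small standalone illustration (not the crash code,
just the bit arithmetic, with example VA_BITS=48 addresses):

        #include <stdio.h>

        /* first clear bit below the run of sign-extension 1s */
        static int first_clear_bit(unsigned long vaddr)
        {
                int bit;

                for (bit = 63; bit >= 0; bit--)
                        if (!(vaddr & (1UL << bit)))
                                break;
                return bit;
        }

        int main(void)
        {
                /* pre-4.6 linear-map text address, VA_BITS=48 */
                printf("old layout: %d + 2 = %d\n",
                    first_clear_bit(0xffff800000080000UL),
                    first_clear_bit(0xffff800000080000UL) + 2);
                /* 4.6 image address just above VA_START, VA_BITS=48 */
                printf("new layout: %d + 1 = %d\n",
                    first_clear_bit(0xffff000008080000UL),
                    first_clear_bit(0xffff000008080000UL) + 1);
                return 0;
        }

So the backward-compatible fix is probably to choose "+1" or "+2" based on
whether the kernel has the new layout (e.g. the kimage_voffset check), rather
than the #if 1 above.  I'll look into it.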
> >  
> >              break;
> >          }
> >      }
> > @@ -2459,10 +2526,20 @@ arm64_calc_virtual_memory_ranges(void)
> >          break;
> >      }
> >  
> > -    vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) * SIZE(page), PUD_SIZE);
> > +    if (kt->flags2 & ARM64_NEW_VMEMMAP) {
> > +#define STRUCT_PAGE_MAX_SHIFT   6
> > +        vmemmap_size = 1UL << (ms->VA_BITS - machdep->pageshift - 1
> > +                        + STRUCT_PAGE_MAX_SHIFT);
> > +
> > +        vmemmap_start = ms->page_offset - vmemmap_size;
> > +        vmemmap_end = ms->page_offset;
> > +    } else {
> > +        vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) * SIZE(page), PUD_SIZE);
> > +
> > +        vmemmap_start = vmalloc_end + SZ_64K;
> > +        vmemmap_end = vmemmap_start + vmemmap_size;
> > +    }
> >      vmalloc_end = (ms->page_offset - PUD_SIZE - vmemmap_size - SZ_64K);
> > -    vmemmap_start = vmalloc_end + SZ_64K;
> > -    vmemmap_end = vmemmap_start + vmemmap_size;
> >  
> >      ms->vmalloc_end = vmalloc_end - 1;
> >      ms->vmemmap_vaddr = vmemmap_start;
> > diff --git a/defs.h b/defs.h
> > index a09fa9a..40e02fc 100644
> > --- a/defs.h
> > +++ b/defs.h
> > @@ -2844,8 +2844,8 @@ typedef u64 pte_t;
> >  
> >  #define PTOV(X) \
> >      ((unsigned long)(X)-(machdep->machspec->phys_offset)+(machdep->machspec->page_offset))
> > -#define VTOP(X) \
> > -    ((unsigned long)(X)-(machdep->machspec->page_offset)+(machdep->machspec->phys_offset))
> > +
> > +#define VTOP(X)   arm64_VTOP((ulong)(X))
> >  
> >  #define USERSPACE_TOP   (machdep->machspec->userspace_top)
> >  #define PAGE_OFFSET     (machdep->machspec->page_offset)
> > @@ -2944,12 +2944,16 @@ typedef signed int s32;
> >   * arch/arm64/include/asm/memory.h
> >   * arch/arm64/include/asm/pgtable.h
> >   */
> > -
> > -#define ARM64_PAGE_OFFSET    ((0xffffffffffffffffUL) << (machdep->machspec->VA_BITS - 1))
> > +#define ARM64_VA_START       ((0xffffffffffffffffUL) \
> > +                    << machdep->machspec->VA_BITS)
> > +#define ARM64_PAGE_OFFSET    ((0xffffffffffffffffUL) \
> > +                    << (machdep->machspec->VA_BITS - 1))
> >  #define ARM64_USERSPACE_TOP  ((1UL) << machdep->machspec->VA_BITS)
> > -#define ARM64_MODULES_VADDR  (ARM64_PAGE_OFFSET - MEGABYTES(64))
> > -#define ARM64_MODULES_END    (ARM64_PAGE_OFFSET - 1)
> > -#define ARM64_VMALLOC_START  ((0xffffffffffffffffUL) << machdep->machspec->VA_BITS)
> > +
> > +/* only used for v4.6 or later */
> > +#define ARM64_MODULES_VSIZE     MEGABYTES(128)
> > +#define ARM64_KASAN_SHADOW_SIZE (1UL << (machdep->machspec->VA_BITS - 3))
> > +
> >  /*
> >   * The following 3 definitions are the original values, but are obsolete
> >   * for 3.17 and later kernels because they are now build-time calculations.
> > @@ -3028,6 +3032,10 @@ struct machine_specific {
> >      ulong kernel_flags;
> >      ulong irq_stack_size;
> >      ulong *irq_stacks;
> > +    /* only needed for kaslr-enabled kernel */
> > +    ulong kimage_voffset;
> > +    ulong kimage_text;
> > +    ulong kimage_end;
> >  };
> 
> Don't forget -- please display these in arm64_dump_machdep_table()!  ;-)

Yes, I will.
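Something along these lines, next to the other machspec fields printed by
arm64_dump_machdep_table() (a sketch only: field names are the ones added in
this patch, "ms" is assumed to be machdep->machspec as elsewhere in that
function, and the widths/format should be adjusted to match the surrounding
"help -m" output):

        fprintf(fp, "    kimage_voffset: %lx\n", ms->kimage_voffset);
        fprintf(fp, "       kimage_text: %lx\n", ms->kimage_text);
        fprintf(fp, "        kimage_end: %lx\n", ms->kimage_end);

plus an "ARM64_NEW_VMEMMAP" string in the flags line once the bit moves into
machdep->flags.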
> >  
> >  struct arm64_stackframe {
> > diff --git a/main.c b/main.c
> > index 821bb4e..3f24908 100644
> > --- a/main.c
> > +++ b/main.c
> > @@ -227,9 +227,10 @@ main(int argc, char **argv)
> >                      optarg);
> >                  }
> >              } else if (STREQ(long_options[option_index].name, "kaslr")) {
> > -                if (!machine_type("X86_64"))
> > -                    error(INFO, "--kaslr only valid "
> > -                        "with X86_64 machine type.\n");
> > +                if (!machine_type("X86_64") &&
> > +                    !machine_type("ARM64"))
> > +                    error(INFO, "--kaslr not valid "
> > +                        "with this machine type.\n");
> >                  else if (STREQ(optarg, "auto"))
> >                      kt->flags2 |= (RELOC_AUTO|KASLR);
> >                  else {
> > diff --git a/symbols.c b/symbols.c
> > index a8d3563..df27793 100644
> > --- a/symbols.c
> > +++ b/symbols.c
> > @@ -593,7 +593,8 @@ kaslr_init(void)
> >  {
> >      char *string;
> >  
> > -    if (!machine_type("X86_64") || (kt->flags & RELOC_SET))
> > +    if ((!machine_type("X86_64") && !machine_type("ARM64")) ||
> > +        (kt->flags & RELOC_SET))
> >          return;
> >  
> >      /*
> > @@ -712,7 +713,7 @@ store_symbols(bfd *abfd, int dynamic, void *minisyms, long symcount,
> >      if (machine_type("X86")) {
> >          if (!(kt->flags & RELOC_SET))
> >              kt->flags |= RELOC_FORCE;
> > -    } else if (machine_type("X86_64")) {
> > +    } else if (machine_type("X86_64") || machine_type("ARM64")) {
> >          if ((kt->flags2 & RELOC_AUTO) && !(kt->flags & RELOC_SET))
> >              derive_kaslr_offset(abfd, dynamic, from,
> >                  fromend, size, store);
> > @@ -783,7 +784,8 @@ store_sysmap_symbols(void)
> >          error(FATAL, "symbol table namespace malloc: %s\n",
> >              strerror(errno));
> >  
> > -    if (!machine_type("X86") && !machine_type("X86_64"))
> > +    if (!machine_type("X86") && !machine_type("X86_64") &&
> > +        !machine_type("ARM64"))
> >          kt->flags &= ~RELOC_SET;
> >  
> >      first = 0;
> > @@ -4681,7 +4683,7 @@ value_search(ulong value, ulong *offset)
> >      if ((sp = machdep->value_to_symbol(value, offset)))
> >          return sp;
> >  
> > -    if (IS_VMALLOC_ADDR(value)) 
> > +    if (IS_VMALLOC_ADDR(value))
> >          goto check_modules;
> >  
> >      if ((sp = symval_hash_search(value)) == NULL)
> > -- 
> > 2.8.1
> > 
> 
> That's all I've got for this pass.

Though I've not addressed all of your comments yet, I will send out an
updated patch tomorrow.

Thanks,
-Takahiro AKASHI

> Thanks again,
>   Dave
> 
> --
> Crash-utility mailing list
> Crash-utility@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/crash-utility

-- 
Thanks,
-Takahiro AKASHI

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility