On Wed, May 25, 2016 at 10:28:10AM -0400, Dave Anderson wrote: > > ----- Original Message ----- > > On Tue, May 24, 2016 at 01:59:06PM -0400, Dave Anderson wrote: > > > > > > ----- Original Message ----- > > > > Yet some issues, but ... > > > > > > > > > > Hi Takahiro, > > > > > > Here are my general comments on my testing of the v2 patch, followed > > > by a few comments in the patch itself. > > > > Thank you for your quick review/testing, again. > > > > > First, the combination of the new memory map layout and KASLR is somewhat > > > confusing. I am testing your patch on a 4.6.0-0.rc7.git2.1.fc25 kernel > > > that has this configuration: > > > > Right. I use the "kaslr kernel" or "kaslr-enabled kernel" in two > > meanings (almost interchangeably): > > - kernel with a new memory map, that is v4.6 or later > > - kernel that has ability of KASLR (configured with CONFIG_RANDOMIZE_RAM) > > > > When we talked offline, you mentioned the possibility of backporting > > KASLR to older kernel, so I hesitated to use "v4.6 or later". > > But I think that we'd better use "v4.6 or later" for a clarification > > for now. > > Actually I was thinking more along the lines of the new vmemmap stuff > being backported. But it looks like that would be impossible without > also bringing along "kimage_voffset". Yeah, a sbustantial amount of code on mm code must be changed, and it is impractical to backport them. > > > > config-arm64:# CONFIG_RANDOMIZE_BASE is not set > > > > > > So KASLR doesn't really enter into the picture. But when bringing > > > up the crash session, it shows the "kaslr kernel" WARNING: > > > > > > # ./crash > > > > > > crash 7.1.5++ > > > Copyright (C) 2002-2016 Red Hat, Inc. > > > Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation > > > Copyright (C) 1999-2006 Hewlett-Packard Co > > > Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited > > > Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. > > > Copyright (C) 2005, 2011 NEC Corporation > > > Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. > > > Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. > > > This program is free software, covered by the GNU General Public License, > > > and you are welcome to change it and/or distribute copies of it under > > > certain conditions. Enter "help copying" to see the conditions. > > > This program has absolutely no warranty. Enter "help warranty" for > > > details. > > > > > > WARNING: kimage_voffset not identified for kaslr kernel > > > GNU gdb (GDB) 7.6 > > > Copyright (C) 2013 Free Software Foundation, Inc. > > > License GPLv3+: GNU GPL version 3 or later > > > <http://gnu.org/licenses/gpl.html> > > > This is free software: you are free to change and redistribute it. > > > There is NO WARRANTY, to the extent permitted by law. Type "show > > > copying" > > > and "show warranty" for details. > > > This GDB was configured as "aarch64-unknown-linux-gnu"... > > > > > > KERNEL: > > > /usr/lib/debug/lib/modules/4.6.0-0.rc7.git2.1.fc25.aarch64/vmlinux > > > DUMPFILE: /dev/crash > > > CPUS: 8 > > > DATE: Tue May 24 10:08:08 2016 > > > UPTIME: 11 days, 18:32:41 > > > LOAD AVERAGE: 0.17, 0.09, 0.12 > > > TASKS: 197 > > > NODENAME: apm-mustang-ev3-36.khw.lab.eng.bos.redhat.com > > > RELEASE: 4.6.0-0.rc7.git2.1.fc25.aarch64 > > > VERSION: #1 SMP Thu May 12 13:28:43 UTC 2016 > > > MACHINE: aarch64 (unknown Mhz) > > > MEMORY: 16 GB > > > PID: 7556 > > > COMMAND: "crash" > > > TASK: fffffe00beb45400 [THREAD_INFO: fffffe00beb98000] > > > CPU: 7 > > > STATE: TASK_RUNNING (ACTIVE) > > > > > > crash> > > > > > > Why show that WARNING in this case? I understant that it's not > > > stashed during early initialization: > > > > If we don't know kimage_voffset, we can't analyze the contents of > > memory (live or kdump). > > I understand. But clearly that's not the case with my system above that > does not have CONFIG_RANDOMIZE_BASE configured, so the warning message > is nonsensical/confusing. That's why I was speculating about perhaps > having kdump export a VMCOREINFO_CONFIG(RANDOMIZE_BASE) so that it is > readily apparent. > > > > > > crash> help -m > > > flags: 104000c5 > > > (KSYMS_START|VM_L2_64K|VMEMMAP|IRQ_STACKS|MACHDEP_BT_TEXT|NEW_VMEMMAP) > > > ... [ cut ] ... > > > memory map layout: new <--- BTW, this is > > > redundant/not-a-member, and we've got NEW_VMEMMAP > > > VA_BITS: 42 > > > userspace_top: 0000040000000000 > > > page_offset: fffffe0000000000 > > > vmalloc_start_addr: fffffc0008000000 > > > vmalloc_end: fffffdff5ffeffff > > > modules_vaddr: fffffc0000000000 > > > modules_end: fffffc0007ffffff > > > vmemmap_vaddr: fffffdff80000000 > > > vmemmap_end: fffffdffffffffff > > > kimage_text: fffffc0008080000 > > > kimage_end: fffffc0009070000 > > > kimage_voffset: 0000000000000000 <-- available if kernel is > > > not randomized > > > phys_offset: 4000000000 > > > ... > > > > > > But it can be read: > > > > > > crash> px kimage_voffset > > > kimage_voffset = $1 = 0xfffffbc008000000 > > > crash> > > > > > > SO because it wasn't determined during session initialization, > > > it falls into the "no randomness" section of arm64_VTOP(): > > > > Exactly. > > > > > ulong > > > arm64_VTOP(ulong addr) > > > { > > > if (!(machdep->flags & NEW_VMEMMAP) || > > > (addr >= machdep->machspec->page_offset)) { > > > return machdep->machspec->phys_offset > > > + (addr - machdep->machspec->page_offset); > > > } else { > > > if (machdep->machspec->kimage_voffset) > > > return addr - machdep->machspec->kimage_voffset; > > > else /* no randomness */ > > > return machdep->machspec->phys_offset > > > + (addr - > > > machdep->machspec->vmalloc_start_addr); > > > } > > > } > > > > > > That works, but if "kimage_offset" can be read normally later on, perhaps it > > > should update "machdep->machspec->kimage_voffset" at that time? > > > > It is possible, but the check in arm64_VTOP() would be still necessary. > > Agreed... Keep it there. > > > > > There are some discrepancies with respect to the calculated addresses > > > and what is seen by looking at the reported memory map in the kernel log: > > > > > > [ 0.000000] Virtual kernel memory layout: > > > [ 0.000000] modules : 0xfffffc0000000000 - 0xfffffc0008000000 ( > > > 128 MB) > > > [ 0.000000] vmalloc : 0xfffffc0008000000 - 0xfffffdfedfff0000 ( > > > 2043 GB) > > > [ 0.000000] .text : 0xfffffc0008080000 - 0xfffffc0008890000 ( > > > 8256 KB) > > > .rodata : 0xfffffc0008890000 - 0xfffffc0008c10000 ( > > > 3584 KB) > > > .init : 0xfffffc0008c10000 - 0xfffffc0008d50000 ( > > > 1280 KB) > > > .data : 0xfffffc0008d50000 - 0xfffffc0008eaac00 ( > > > 1387 KB) > > > [ 0.000000] vmemmap : 0xfffffdfee0000000 - 0xfffffdffe0000000 ( > > > 4 GB maximum) > > > 0xfffffdfee0000000 - 0xfffffdfee1000000 ( > > > 16 MB actual) > > > [ 0.000000] fixed : 0xfffffdfffe7d0000 - 0xfffffdfffec00000 ( > > > 4288 KB) > > > [ 0.000000] PCI I/O : 0xfffffdfffee00000 - 0xfffffdffffe00000 ( > > > 16 MB) > > > [ 0.000000] memory : 0xfffffe0000000000 - 0xfffffe0400000000 ( > > > 16384 MB) > > > > > > Comparing it to the calculated values: > > > > > > VA_BITS: 42 > > > userspace_top: 0000040000000000 > > > page_offset: fffffe0000000000 OK > > > vmalloc_start_addr: fffffc0008000000 OK > > > vmalloc_end: fffffdff5ffeffff ? (seems wrong) > > > modules_vaddr: fffffc0000000000 OK > > > modules_end: fffffc0007ffffff OK (-1) > > > vmemmap_vaddr: fffffdff80000000 ? (seems wrong) > > > vmemmap_end: fffffdffffffffff ? (seems wrong) > > > kimage_text: fffffc0008080000 OK > > > kimage_end: fffffc0009070000 OK > > > kimage_voffset: 0000000000000000 > > > phys_offset: 4000000000 > > > > > > Starting with vmalloc_start_addr/vmalloc_end, if I run "kmem -v" to dump > > > the > > > vmalloc allocations, there are some allocations that are below the > > > 0xfffffc0008000000 start value shown in the log and by your calcuation. > > > I'm not clear on what that means?: > > > > > > crash> kmem -v > > > VMAP_AREA VM_STRUCT ADDRESS RANGE > > > SIZE > > > fffffe03daefd900 fffffe03daefd880 fffffc0000c10000 - fffffc0000c30000 > > > 131072 > > > fffffe00bf2c3a00 fffffe00bf2c3980 fffffc0000c90000 - fffffc0000cb0000 > > > 131072 > > > fffffe00bf2c7400 fffffe00bf2c7380 fffffc0000d50000 - fffffc0000d70000 > > > 131072 > > > fffffe03dc76aa00 fffffe03dc76a980 fffffc0000d90000 - fffffc0000db0000 > > > 131072 > > > fffffe03dc76be00 fffffe03dc76bd80 fffffc0000dd0000 - fffffc0000ee0000 > > > 1114112 > > > fffffe03d90e7900 fffffe03d90e7b80 fffffc0000f40000 - fffffc0000fb0000 > > > 458752 > > > fffffe00bec12f00 fffffe00bec12b00 fffffc0000fe0000 - fffffc0001000000 > > > 131072 > > > fffffe00bf3a1100 fffffe00bf3a1080 fffffc0001020000 - fffffc0001050000 > > > 196608 > > > fffffe00bf3a3280 fffffe00bf3a3200 fffffc0001070000 - fffffc0001090000 > > > 131072 > > > fffffe03ddf7fc00 fffffe03ddf7a880 fffffc0001230000 - fffffc0001250000 > > > 131072 > > > fffffe03fd508900 fffffe03fd508880 fffffc0008000000 - fffffc0008020000 > > > 131072 > > > fffffe03fd509580 fffffe03fd508d00 fffffc0008020000 - fffffc0008040000 > > > 131072 > > > fffffe03fd508a00 fffffe03fd508980 fffffc0008040000 - fffffc0008070000 > > > 196608 > > > ... [ cut ] ... > > > fffffe03d7235000 fffffe03d7233480 fffffc0014b30000 - fffffc0014b50000 > > > 131072 > > > fffffe00b921e700 fffffe00b921df00 fffffc0014b70000 - fffffc0014b90000 > > > 131072 > > > fffffe00b851eb00 fffffe00b8512900 fffffdfedfcf0000 - fffffdfedfe70000 > > > 1572864 > > > fffffe03dca4ef00 fffffe03dca4ef80 fffffdfedfe70000 - fffffdfedfff0000 > > > 1572864 > > > crash> > > > > > > Although the end value of fffffdfedfff0000 does match up with the value in > > > the > > > kernel log, your calculation is 2GB higher at fffffdff5ffeffff: > > > > > > crash> eval fffffdff5ffeffff - 0xfffffdfedfff0000 > > > hexadecimal: 7fffffff > > > decimal: 2147483647 > > > octal: 17777777777 > > > binary: > > > 0000000000000000000000000000000001111111111111111111111111111111 > > > crash> eval 7fffffff + 1 > > > hexadecimal: 80000000 (2GB) > > > decimal: 2147483648 > > > octal: 20000000000 > > > binary: > > > 0000000000000000000000000000000010000000000000000000000000000000 > > > crash> > > > > > > With respect to the vmemmap range, vmmemap address in the log file is the > > > one > > > that is displayed by "kmem -p": > > > > > > crash> kmem -p > > > PAGE PHYSICAL MAPPING INDEX CNT FLAGS > > > fffffdfee0000000 4000000000 fffffe03fd441950 2f 1 1002c > > > referenced,uptodate,lru,mappedtodisk > > > fffffdfee0000040 4000010000 fffffe03fd441950 30 1 1002c > > > referenced,uptodate,lru,mappedtodisk > > > fffffdfee0000080 4000020000 fffffe03fd441950 31 1 1002c > > > referenced,uptodate,lru,mappedtodisk > > > fffffdfee00000c0 4000030000 fffffe03fd441950 32 1 1002c > > > referenced,uptodate,lru,mappedtodisk > > > ... > > > fffffdfee0ffff00 43fffc0000 0 0 1 > > > 4000000000000400 reserved > > > fffffdfee0ffff40 43fffd0000 0 0 1 > > > 4000000000000400 reserved > > > fffffdfee0ffff80 43fffe0000 0 0 1 > > > 4000000000000400 reserved > > > fffffdfee0ffffc0 43ffff0000 0 0 1 > > > 4000000000000400 reserved > > > crash> > > > > > > And "kmem" alone recognizes it: > > > > > > crash> kmem fffffdfee0000000 > > > PAGE PHYSICAL MAPPING INDEX CNT FLAGS > > > fffffdfee0000000 4000000000 fffffe03fd441950 2f 1 1002c > > > referenced,uptodate,lru,mappedtodisk > > > crash> > > > > > > But your calculated vmemmap_vaddr above doesn't seem right: > > > > > > crash> kmem fffffdff80000000 > > > kmem: WARNING: cannot make virtual-to-physical translation: > > > fffffdff80000000 > > > fffffdff80000000: kernel virtual address not found in mem map > > > crash> > > > > > > Do your numbers match up on a dump that actually has CONFIG_RANDOMIZE_BASE? > > > > In my environment, the numbers above match up on a dump whether > > CONFIG_RANDOMIZE_BASE is enabled or not. > > > > Are you using a mainline kernel (final v4.6, not -rcX)? > > Good point. When I configured my arm64 test system, I installed the latest > Fedora kernel available at the time (4.6.0-0.rc7.git2.1.fc25), but it is based > upon linux-4.5.tar.xz. I see now that there is a kernel-4.6.0-1.fc25 available > that is based upon linux-4.6.tar.xz. I'll update the system. > > > > > Anyway, onto the patch... > > > > > > > > > > > changes in v2: > > > > * Fixed build warnings > > > > * Moved ARM64_NEW_VMEMMAP to machdep->flags > > > > * Show additional kaslr-related parameters in arm64_dump_machdep_table() > > > > * Handle a VMCOREINFO, "NUMBER(kimage_voffset)" > > > > > > > > ===8<=== > > > > >From 080a54ec232ac48ef2d8123cbedcf0f1fe27e391 Mon Sep 17 00:00:00 2001 > > > > From: AKASHI Takahiro <takahiro.akashi@xxxxxxxxxx> > > > > Date: Mon, 16 May 2016 17:31:55 +0900 > > > > Subject: [PATCH v2] arm64: fix kernel memory map handling for > > > > kaslr-enabled > > > > kernel > > > > > > > > In kernel v4.6, Kernel ASLR (KASLR) is supported on arm64, and the start > > > > address of the kernel image can be randomized if CONFIG_RANDOMIZE_BASE is > > > > enabled. > > > > Even worse, the kernel image is no more mapped in the linear mapping, but > > > > in vmalloc area (i.e. below PAGE_OFFSET). > > > > > > > > Now, according to the kernel's memory.h, converting a virtual address to > > > > a physical address should be done like below: > > > > > > > > phys_addr_t __x = (phys_addr_t)(x); \ > > > > __x & BIT(VA_BITS - 1) ? (__x & ~PAGE_OFFSET) + PHYS_OFFSET : \ > > > > (__x - kimage_voffset); }) > > > > > > > > Please note that PHYS_OFFSET is no more equal to the start address of > > > > the first usable memory block in SYSTEM RAM due to the fact mentioned > > > > above. > > > > > > > > This patch addresses this change and allows the crash utility to access > > > > memory contents with correct addresses. > > > > *However*, it is still rough-edged and we need more refinements. > > > > > > > > 1-1) On a live system, crash with this patch won't work because > > > > we currently have no way to know kimage_voffset. > > > > > > > > 1-2) For a core dump file, we can do simply: > > > > $ crash <vmlinux> <vmcore> > > > > as long as the file has "NUMBER(kimage_voffset)" > > > > (RELOC_AUTO|KASLR is automatically set.) > > > > > > > > I'm planning to add this enhancement in my next version of kexec/kdump > > > > patch, i.e. v17. > > > > > > > > 2) My current change breaks the backward compatibility. > > > > Crash won't work with v4.5 or early kernel partly because I changed > > > > the calculation of VA_BITS. (I think that the existing code is wrong.) > > > > So far I don't have enough time to look into this issue as I'm > > > > focusing > > > > on KASLR support. > > > > > > OK, but you still haven't shown how the current VA_BITS determination is > > > not a correct representation of CONFIG_ARM64_VA_BITS. > > > > At least, the current VA_BITS doesn't work with my calculations, which > > were derived directly from v4.6's "asm/memory.h". > > What are your kernel's PAGE_OFFSET value and CONFIG_ARM64_VA_BITS value? > > > > > > > 3) Backtracing a 'panic'ed task fails: > > > > crash> bt > > > > PID: 999 TASK: ffffffe960081800 CPU: 0 COMMAND: "sh" > > > > bt: WARNING: corrupt prstatus? pstate=0x80000000, but no user frame > > > > found > > > > bt: WARNING: cannot determine starting stack frame for task > > > > ffffffe960081800 > > > > > > > > frame->pc indicates that it is a kernel address, so obviously something > > > > is wrong. I'm diggin into it. > > > > > > Can you add a debug statement that displays frame->pc, frame->sp, and > > > frame->fp? > > > > I've already identified the cause. > > pt_regs->pstate was set mistakenly from "sprsr_el1" in panic(). > > But I have a difficulty here to fix the problem as we have no direct way > > to retrieve a value of the *current* PSTATE. > > Interesting. I don't know what you mean by "sprsr_el1" in panic(), but > I just got a report from my github "issues", where the user injected a panic() > call into a kernel module and got the same error as above because of the pstate > value. I asked him to put this hack into arm64.c: spspr_el1 is a register that holds a value of PSTATE when an exception is trapped into EL1. Since panic() is called even in case of software bugs, it is not useful anyway. Currently I'm trying to save a "faked" current PSTATE into pt_regs->pstate and will add this fix in my next kdump patch series (v17). > --- a/arm64.c > +++ b/arm64.c > @@ -1711,11 +1711,14 @@ arm64_get_dumpfile_stackframe(struct bt_info *bt, struct arm64_stackframe *frame > error(WARNING, > "corrupt prstatus? pstate=0x%lx, but no user frame found\n", > ptregs->pstate); > + if (INSTACK(frame->sp, bt) && INSTACK(frame->fp, bt)) > + goto try_kernel; > bt->flags |= BT_REGS_NOT_FOUND; > return FALSE; > } > bt->flags |= BT_USER_SPACE; > } else { > +try_kernel: > frame->sp = ptregs->sp; > frame->fp = ptregs->regs[29]; > } > > Can you give that a shot? Well, I think I've already fixed... > > > > > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@xxxxxxxxxx> > > > > --- > > > > arm64.c | 217 > > > > ++++++++++++++++++++++++++++++++++++++++++++++++-------------- > > > > defs.h | 24 +++++-- > > > > main.c | 7 +- > > > > symbols.c | 10 +-- > > > > 4 files changed, 196 insertions(+), 62 deletions(-) > > > > > > > > diff --git a/arm64.c b/arm64.c > > > > index 34c8c59..6b97093 100644 > > > > --- a/arm64.c > > > > +++ b/arm64.c > > > > @@ -72,6 +72,21 @@ static int arm64_get_crash_notes(void); > > > > static void arm64_calc_VA_BITS(void); > > > > static int arm64_is_uvaddr(ulong, struct task_context *); > > > > > > > > +ulong > > > > +arm64_VTOP(ulong addr) > > > > +{ > > > > + if (!(machdep->flags & NEW_VMEMMAP) || > > > > + (addr >= machdep->machspec->page_offset)) { > > > > + return machdep->machspec->phys_offset > > > > + + (addr - machdep->machspec->page_offset); > > > > + } else { > > > > + if (machdep->machspec->kimage_voffset) > > > > + return addr - machdep->machspec->kimage_voffset; > > > > + else /* no randomness */ > > > > + return machdep->machspec->phys_offset > > > > + + (addr - machdep->machspec->vmalloc_start_addr); > > > > + } > > > > +} > > > > > > > > /* > > > > * Do all necessary machine-specific setup here. This is called several > > > > times > > > > @@ -81,6 +96,7 @@ void > > > > arm64_init(int when) > > > > { > > > > ulong value; > > > > + char *string; > > > > struct machine_specific *ms; > > > > > > > > #if defined(__x86_64__) > > > > @@ -102,9 +118,34 @@ arm64_init(int when) > > > > if (machdep->cmdline_args[0]) > > > > arm64_parse_cmdline_args(); > > > > machdep->flags |= MACHDEP_BT_TEXT; > > > > + > > > > + ms = machdep->machspec; > > > > + if (!ms->kimage_voffset && > > > > + (string = pc->read_vmcoreinfo("NUMBER(kimage_voffset)"))) { > > > > + ms->kimage_voffset = htol(string, QUIET, NULL); > > > > + free(string); > > > > + } > > > > > > It doesn't make sense to call pc->read_vmcoreinfo() if !DUMPFILE(). > > > > Do you think so? > > By default, pc->read_vmcoreinfo is set to no_vmcoreinfo. > > So it doesn't hurt anything. > > Right, just a suggestion... So keep it there for now. > > > > + > > > > + if (ms->kimage_voffset) { > > > > + machdep->flags |= NEW_VMEMMAP; > > > > + > > > > + /* if not specified in command line */ > > > > + if (!kt->relocate && !(kt->flags2 & (RELOC_AUTO|KASLR))) > > > > + kt->flags2 |= (RELOC_AUTO|KASLR); > > > > + } > > > > + > > > > > > If CONFIG_RANDOMIZE_BASE is NOT set, should RELOC_AUTO|KASLR be set? They > > > don't get set on my live system. > > > > Not necessarily, but the current implementation of derive_kaslr_offset() > > just works even in this case. > > Ahah -- cool! kt->relocate will be set to 0 in this case, we will see no difference between KASLR and non-KASLR case. > > (This is the reason that I don't think we need a VMCOREINFO for > > CONFIG_RANDOMIZE_BASE.) > > Well, as long as there's a way to avoid that unnecessary/confusing > warning message for non-KASLR new-vmemmap kernels. > > I also wonder whether makedumpfile could use it? -> Pratyush? > > > > break; > > > > > > > > case PRE_GDB: > > > > + /* This check is somewhat redundant */ > > > > + if (kernel_symbol_exists("kimage_voffset")) { > > > > + machdep->flags |= NEW_VMEMMAP; > > > > + > > > > + if (!machdep->machspec->kimage_voffset) > > > > + error(WARNING, > > > > + "kimage_voffset not identified for kaslr kernel\n"); > > > > + } > > > > + > > > > > > As mentioned in the general discussion above, the WARNING above makes > > > no sense if CONFIG_RANDOMIZE_BASE is not configured. > > > > What about if I change the message "for v4.6 or later kernel"? > > But again, I do not want the message at all if it's not applicable. I removed this message. > Worse case, you could set a flag, and then if/when the crash session > ultimately craps out if it can't figure things out, then the message > could be displayed post-mortem. > > > > I'm now thinking that CONFIG_RANDOMIZE_BASE (and maybe others like KASAN?) > > > should be passed in the dumpfile with VMCOREINFO_CONFIG(). > > > > Regarding KASAN, we only need to check for an existence of "kasan_init" > > symbol as you recommended before :) > > Nice. > > > > > > > > if (!machdep->pagesize) { > > > > /* > > > > * Kerneldoc Documentation/arm64/booting.txt describes > > > > @@ -160,16 +201,35 @@ arm64_init(int when) > > > > machdep->pagemask = ~((ulonglong)machdep->pageoffset); > > > > > > > > arm64_calc_VA_BITS(); > > > > - machdep->machspec->page_offset = ARM64_PAGE_OFFSET; > > > > + ms = machdep->machspec; > > > > + ms->page_offset = ARM64_PAGE_OFFSET; > > > > + /* FIXME: idmap for NEW_VMEMMAP */ > > > > machdep->identity_map_base = ARM64_PAGE_OFFSET; > > > > - machdep->machspec->userspace_top = ARM64_USERSPACE_TOP; > > > > - machdep->machspec->modules_vaddr = ARM64_MODULES_VADDR; > > > > - machdep->machspec->modules_end = ARM64_MODULES_END; > > > > - machdep->machspec->vmalloc_start_addr = ARM64_VMALLOC_START; > > > > - machdep->machspec->vmalloc_end = ARM64_VMALLOC_END; > > > > - machdep->kvbase = ARM64_VMALLOC_START; > > > > - machdep->machspec->vmemmap_vaddr = ARM64_VMEMMAP_VADDR; > > > > - machdep->machspec->vmemmap_end = ARM64_VMEMMAP_END; > > > > + machdep->kvbase = ARM64_VA_START; > > > > + ms->userspace_top = ARM64_USERSPACE_TOP; > > > > + if (machdep->flags & NEW_VMEMMAP) { > > > > + struct syment *sp; > > > > + > > > > + sp = kernel_symbol_search("_text"); > > > > + ms->kimage_text = (sp ? sp->value : 0); > > > > + sp = kernel_symbol_search("_end"); > > > > + ms->kimage_end = (sp ? sp->value : 0); > > > > + > > > > + ms->modules_vaddr = ARM64_VA_START; > > > > + if (kernel_symbol_exists("kasan_init")) > > > > + ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE; > > > > + ms->modules_end = ms->modules_vaddr > > > > + + ARM64_MODULES_VSIZE -1; > > > > + > > > > + ms->vmalloc_start_addr = ms->modules_end + 1; > > > > + } else { > > > > + ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64); > > > > + ms->modules_end = ARM64_PAGE_OFFSET - 1; > > > > + ms->vmalloc_start_addr = ARM64_VA_START; > > > > + } > > > > + ms->vmalloc_end = ARM64_VMALLOC_END; > > > > + ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR; > > > > + ms->vmemmap_end = ARM64_VMEMMAP_END; > > > > > > > > switch (machdep->pagesize) > > > > { > > > > @@ -232,8 +292,6 @@ arm64_init(int when) > > > > machdep->stacksize = ARM64_STACK_SIZE; > > > > machdep->flags |= VMEMMAP; > > > > > > > > - arm64_calc_phys_offset(); > > > > - > > > > machdep->uvtop = arm64_uvtop; > > > > machdep->kvtop = arm64_kvtop; > > > > machdep->is_kvaddr = generic_is_kvaddr; > > > > @@ -262,6 +320,10 @@ arm64_init(int when) > > > > machdep->dumpfile_init = NULL; > > > > machdep->verify_line_number = NULL; > > > > machdep->init_kernel_pgd = arm64_init_kernel_pgd; > > > > + > > > > + /* use machdep parameters */ > > > > + arm64_calc_phys_offset(); > > > > + > > > > > > This doesn't make any sense to me. I understand you've done it because > > > you are now doing a readmem() in arm64_calc_phys_offset(). But readmem() > > > calls are never done this early. And in fact, on the live system the > > > readmem() in arm64_calc_phys_offset() fails like so (running "crash -d4"): > > > > > > <readmem: fffffc0008d70bb0, KVADDR, "memstart_addr", 8, (ROE|Q), > > > 3ffcc2842c0> > > > <read_memory_device: addr: fffffc0008d70bb0 paddr: d70bb0 cnt: 8> > > > crash: read error: kernel virtual address: fffffc0008d70bb0 type: > > > "memstart_addr" > > > > > > It fails above because it needs the phys_base in order to do the readmem(). > > > During > > > runtime it has it: > > > > This is because we don't know kimage_voffset on a live system. > > (and so I mentioned this issue in my commit log.) > > Right -- which is a good reason for NOT using readmem() at that point > in time. I can also envision a scenario where readmem() may inadvertently > "work" (i.e. depending upon where phys_base really is), and then return > something bogus from an unintended location. > > > As far as kdump case is concerned, readmem() at this point just works > > for any address in kernel image. This is my intentional use of this function. > > I understand, but you're subverting the subsequent dumpfile-specific > calls later in the function. It is precisely *their* responsibility > to return the value of phys_offset. I think that the code should precede the other cases, including diskdump and kdump. The only exception would be a live sysmem with CONFIG_RANDOMIZE_RAM configured. I will temporarily add a check here like: arm64_calc_phys_offset() { if (machdep->flags & NEW_VMEMMAP) sp = kernel_symbol_search("memstart_addr"); readmem(sp->value, ...); ... } This way, even on a live system with no KASLR, "phys_offset" will be determined based on /proc/iomem, and eventually the crash should work. (Sorry, I haven't tested it yet though.) > > > crash> set debug 4 > > > debug: 4 > > > crash> rd memstart_addr > > > <addr: fffffc0008d70bb0 count: 1 flag: 490 (KVADDR)> > > > <readmem: fffffc0008d70bb0, KVADDR, "64-bit KVADDR", 8, (FOE), > > > 3ffcbfbfc28> > > > <read_memory_device: addr: fffffc0008d70bb0 paddr: 4000d70bb0 cnt: 8> > > > fffffc0008d70bb0: 0000004000000000 ....@... > > > crash> > > > > > > So without kimage_voffset, that readmem(), how would that readmem() > > > possibly work? > > > You need the *contents* of memstart_addr in order to *read* memstart_addr. > > > > > > Now, if it kimage_voffset were passed into the vmcore, then I guess you > > > could > > > read it. But why do it that way? That's what diskdump_get_phys_base() and > > > kdump_get_phys_base() are supposed to do. > > > > Because I believe that this is the *common* place to do it either for > > diskdump, or kdump ( or even a live system if we know kimage_voffset.) > > Well, as I have shown, it won't work on my live system because of the > chicken-and-egg scenario. Let's worry about live system analysis after > we get kdump straightened out. See above. > > > > > > > > break; > > > > > > > > case POST_GDB: > > > > @@ -409,6 +471,8 @@ arm64_dump_machdep_table(ulong arg) > > > > fprintf(fp, "%sIRQ_STACKS", others++ ? "|" : ""); > > > > if (machdep->flags & MACHDEP_BT_TEXT) > > > > fprintf(fp, "%sMACHDEP_BT_TEXT", others++ ? "|" : ""); > > > > + if (machdep->flags & NEW_VMEMMAP) > > > > + fprintf(fp, "%sNEW_VMEMMAP", others++ ? "|" : ""); > > > > fprintf(fp, ")\n"); > > > > > > > > fprintf(fp, " kvbase: %lx\n", machdep->kvbase); > > > > @@ -494,6 +558,8 @@ arm64_dump_machdep_table(ulong arg) > > > > ms = machdep->machspec; > > > > > > > > fprintf(fp, " machspec: %lx\n", (ulong)ms); > > > > + fprintf(fp, " memory map layout: %s\n", > > > > + (machdep->flags & NEW_VMEMMAP) ? "new" : "old"); > > > > > > As before, these internal "help" structure dumps just show existing > > > structure fields, > > > and NEW_VMMEMAP is already shown in the flags. > > > > I put it as useful information for those people who might have got > > any concerns about those odd numbers/kernel layout. > > But if you don't like it, I will remove it. I removed it. > > > > > > > fprintf(fp, " VA_BITS: %ld\n", ms->VA_BITS); > > > > fprintf(fp, " userspace_top: %016lx\n", ms->userspace_top); > > > > fprintf(fp, " page_offset: %016lx\n", ms->page_offset); > > > > @@ -503,6 +569,11 @@ arm64_dump_machdep_table(ulong arg) > > > > fprintf(fp, " modules_end: %016lx\n", ms->modules_end); > > > > fprintf(fp, " vmemmap_vaddr: %016lx\n", ms->vmemmap_vaddr); > > > > fprintf(fp, " vmemmap_end: %016lx\n", ms->vmemmap_end); > > > > + if (machdep->flags & NEW_VMEMMAP) { > > > > + fprintf(fp, " kimage_text: %016lx\n", ms->kimage_text); > > > > + fprintf(fp, " kimage_end: %016lx\n", ms->kimage_end); > > > > + fprintf(fp, " kimage_voffset: %016lx\n", ms->kimage_voffset); > > > > + } > > > > fprintf(fp, " phys_offset: %lx\n", ms->phys_offset); > > > > fprintf(fp, "__exception_text_start: %lx\n", > > > > ms->__exception_text_start); > > > > fprintf(fp, " __exception_text_end: %lx\n", ms->__exception_text_end); > > > > @@ -543,6 +614,42 @@ arm64_dump_machdep_table(ulong arg) > > > > } > > > > } > > > > > > > > +static int > > > > +arm64_parse_machdep_arg_l(char *argstring, char *param, ulong *value) > > > > +{ > > > > + int len; > > > > + int megabytes = FALSE; > > > > + char *p; > > > > + > > > > + len = strlen(param); > > > > + if (!STRNEQ(argstring, param) || (argstring[len] != '=')) > > > > + return FALSE; > > > > + > > > > + if ((LASTCHAR(argstring) == 'm') || > > > > + (LASTCHAR(argstring) == 'M')) { > > > > + LASTCHAR(argstring) = NULLCHAR; > > > > + megabytes = TRUE; > > > > + } > > > > + > > > > + p = argstring + len + 1; > > > > + if (strlen(p)) { > > > > + int flags = RETURN_ON_ERROR | QUIET; > > > > + int err = 0; > > > > + > > > > + if (megabytes) { > > > > + *value = dtol(p, flags, &err); > > > > + if (!err) > > > > + *value = MEGABYTES(*value); > > > > + } else { > > > > + *value = htol(p, flags, &err); > > > > + } > > > > + > > > > + if (!err) > > > > + return TRUE; > > > > + } > > > > + > > > > + return FALSE; > > > > +} > > > > > > > > /* > > > > * Parse machine dependent command line arguments. > > > > @@ -554,11 +661,10 @@ arm64_dump_machdep_table(ulong arg) > > > > static void > > > > arm64_parse_cmdline_args(void) > > > > { > > > > - int index, i, c, err; > > > > + int index, i, c; > > > > char *arglist[MAXARGS]; > > > > char buf[BUFSIZE]; > > > > char *p; > > > > - ulong value = 0; > > > > > > > > for (index = 0; index < MAX_MACHDEP_ARGS; index++) { > > > > if (!machdep->cmdline_args[index]) > > > > @@ -580,39 +686,23 @@ arm64_parse_cmdline_args(void) > > > > c = parse_line(buf, arglist); > > > > > > > > for (i = 0; i < c; i++) { > > > > - err = 0; > > > > - > > > > - if (STRNEQ(arglist[i], "phys_offset=")) { > > > > - int megabytes = FALSE; > > > > - int flags = RETURN_ON_ERROR | QUIET; > > > > - > > > > - if ((LASTCHAR(arglist[i]) == 'm') || > > > > - (LASTCHAR(arglist[i]) == 'M')) { > > > > - LASTCHAR(arglist[i]) = NULLCHAR; > > > > - megabytes = TRUE; > > > > - } > > > > - > > > > - p = arglist[i] + strlen("phys_offset="); > > > > - if (strlen(p)) { > > > > - if (megabytes) > > > > - value = dtol(p, flags, &err); > > > > - else > > > > - value = htol(p, flags, &err); > > > > - } > > > > - > > > > - if (!err) { > > > > - if (megabytes) > > > > - value = MEGABYTES(value); > > > > - > > > > - machdep->machspec->phys_offset = value; > > > > - > > > > - error(NOTE, > > > > - "setting phys_offset to: 0x%lx\n\n", > > > > - machdep->machspec->phys_offset); > > > > + if (arm64_parse_machdep_arg_l(arglist[i], > > > > + "phys_offset", > > > > + &machdep->machspec->phys_offset)) { > > > > + error(NOTE, > > > > + "setting phys_offset to: 0x%lx\n\n", > > > > + machdep->machspec->phys_offset); > > > > + > > > > + machdep->flags |= PHYS_OFFSET; > > > > + continue; > > > > + } else if (arm64_parse_machdep_arg_l(arglist[i], > > > > + "kimage_voffset", > > > > + &machdep->machspec->kimage_voffset)) { > > > > + error(NOTE, > > > > + "setting kimage_voffset to: 0x%lx\n\n", > > > > + machdep->machspec->kimage_voffset); > > > > > > > > - machdep->flags |= PHYS_OFFSET; > > > > - continue; > > > > - } > > > > + continue; > > > > } > > > > > > > > error(WARNING, "ignoring --machdep option: %s\n", > > > > @@ -627,10 +717,20 @@ arm64_calc_phys_offset(void) > > > > { > > > > struct machine_specific *ms = machdep->machspec; > > > > ulong phys_offset; > > > > + struct syment *sp; > > > > + ulong value; > > > > > > > > if (machdep->flags & PHYS_OFFSET) /* --machdep override */ > > > > return; > > > > > > > > + sp = kernel_symbol_search("memstart_addr"); > > > > + if (sp && readmem(sp->value, KVADDR, (char *)&value, sizeof(value), > > > > + "memstart_addr", QUIET|RETURN_ON_ERROR)) { > > > > + ms->phys_offset = value; > > > > + machdep->flags |= PHYS_OFFSET; > > > > + return; > > > > + } > > > > + > > > > > > Also, I may be missing something, but the PHYS_OFFSET flag means that it > > > was > > > passed in via --machdep, and is only used in this function to avoid any > > > further > > > attempts to figure out what it is. Here you are setting the flag for no > > > particular > > > reason that I can see. Why? > > > > Well, I don't remember very well. If not necessary, I will remove it. > > OK, I guarantee it's useless in this case... I removed it. Thanks, -Takahiro AKASHI > > > But again, doing a readmem() call this early is setting a bad precedent -- > > > they have > > > *always* been delayed until after gdb_session_init() in main_loop(). And > > > the work > > > of determining phys_base should be done by the dumpfile call-out functions > > > that > > > already exist. > > > > > > > > > > /* > > > > * Next determine suitable value for phys_offset. User can override > > > > this > > > > * by passing valid '--machdep phys_offset=<addr>' option. > > > > @@ -2377,6 +2477,11 @@ arm64_IS_VMALLOC_ADDR(ulong vaddr) > > > > { > > > > struct machine_specific *ms = machdep->machspec; > > > > > > > > + if ((machdep->flags & NEW_VMEMMAP) && > > > > + (vaddr >= machdep->machspec->kimage_text) && > > > > + (vaddr <= machdep->machspec->kimage_end)) > > > > + return FALSE; > > > > + > > > > return ((vaddr >= ms->vmalloc_start_addr && vaddr <= > > > > ms->vmalloc_end) || > > > > ((machdep->flags & VMEMMAP) && > > > > (vaddr >= ms->vmemmap_vaddr && vaddr <= > > > > ms->vmemmap_end)) > > > > || > > > > @@ -2407,7 +2512,11 @@ arm64_calc_VA_BITS(void) > > > > > > > > for (bitval = highest_bit_long(value); bitval; bitval--) { > > > > if ((value & (1UL << bitval)) == 0) { > > > > +#if 1 > > > > + machdep->machspec->VA_BITS = bitval + 1; > > > > +#else /* FIXME: bug? */ > > > > machdep->machspec->VA_BITS = bitval + 2; > > > > +#endif > > > > break; > > > > } > > > > } > > > > > > Again, a non-starter until the above is resolved -- but you know that > > > already... > > > > Just a reminder for now. > > > > > > > > > @@ -2459,10 +2568,22 @@ arm64_calc_virtual_memory_ranges(void) > > > > break; > > > > } > > > > > > > > - vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) * > > > > SIZE(page), PUD_SIZE); > > > > + if (machdep->flags & NEW_VMEMMAP) > > > > +#define STRUCT_PAGE_MAX_SHIFT 6 > > > > + vmemmap_size = 1UL << (ms->VA_BITS - machdep->pageshift - 1 > > > > + + STRUCT_PAGE_MAX_SHIFT); > > > > + else > > > > + vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) * > > > > SIZE(page), PUD_SIZE); > > > > + > > > > vmalloc_end = (ms->page_offset - PUD_SIZE - vmemmap_size - SZ_64K); > > > > - vmemmap_start = vmalloc_end + SZ_64K; > > > > - vmemmap_end = vmemmap_start + vmemmap_size; > > > > + > > > > + if (machdep->flags & NEW_VMEMMAP) { > > > > + vmemmap_start = ms->page_offset - vmemmap_size; > > > > + vmemmap_end = ms->page_offset; > > > > + } else { > > > > + vmemmap_start = vmalloc_end + SZ_64K; > > > > + vmemmap_end = vmemmap_start + vmemmap_size; > > > > + } > > > > > > > > ms->vmalloc_end = vmalloc_end - 1; > > > > ms->vmemmap_vaddr = vmemmap_start; > > > > diff --git a/defs.h b/defs.h > > > > index a09fa9a..2bbb55b 100644 > > > > --- a/defs.h > > > > +++ b/defs.h > > > > @@ -2844,8 +2844,8 @@ typedef u64 pte_t; > > > > > > > > #define PTOV(X) \ > > > > ((unsigned > > > > long)(X)-(machdep->machspec->phys_offset)+(machdep->machspec->page_offset)) > > > > -#define VTOP(X) \ > > > > - ((unsigned > > > > long)(X)-(machdep->machspec->page_offset)+(machdep->machspec->phys_offset)) > > > > + > > > > +#define VTOP(X) arm64_VTOP((ulong)(X)) > > > > > > > > #define USERSPACE_TOP (machdep->machspec->userspace_top) > > > > #define PAGE_OFFSET (machdep->machspec->page_offset) > > > > @@ -2938,18 +2938,23 @@ typedef signed int s32; > > > > #define VM_L3_4K (0x10) > > > > #define KDUMP_ENABLED (0x20) > > > > #define IRQ_STACKS (0x40) > > > > +#define NEW_VMEMMAP (0x80) > > > > > > > > /* > > > > * sources: Documentation/arm64/memory.txt > > > > * arch/arm64/include/asm/memory.h > > > > * arch/arm64/include/asm/pgtable.h > > > > */ > > > > - > > > > -#define ARM64_PAGE_OFFSET ((0xffffffffffffffffUL) << > > > > (machdep->machspec->VA_BITS - 1)) > > > > +#define ARM64_VA_START ((0xffffffffffffffffUL) \ > > > > + << machdep->machspec->VA_BITS) > > > > +#define ARM64_PAGE_OFFSET ((0xffffffffffffffffUL) \ > > > > + << (machdep->machspec->VA_BITS - 1)) > > > > #define ARM64_USERSPACE_TOP ((1UL) << machdep->machspec->VA_BITS) > > > > -#define ARM64_MODULES_VADDR (ARM64_PAGE_OFFSET - MEGABYTES(64)) > > > > -#define ARM64_MODULES_END (ARM64_PAGE_OFFSET - 1) > > > > -#define ARM64_VMALLOC_START ((0xffffffffffffffffUL) << > > > > machdep->machspec->VA_BITS) > > > > + > > > > +/* only used for v4.6 or later */ > > > > +#define ARM64_MODULES_VSIZE MEGABYTES(128) > > > > +#define ARM64_KASAN_SHADOW_SIZE (1UL << (machdep->machspec->VA_BITS - > > > > 3)) > > > > + > > > > /* > > > > * The following 3 definitions are the original values, but are obsolete > > > > * for 3.17 and later kernels because they are now build-time > > > > calculations. > > > > @@ -3028,6 +3033,10 @@ struct machine_specific { > > > > ulong kernel_flags; > > > > ulong irq_stack_size; > > > > ulong *irq_stacks; > > > > + /* only needed for kaslr-enabled kernel */ > > > > + ulong kimage_voffset; > > > > + ulong kimage_text; > > > > + ulong kimage_end; > > > > }; > > > > > > The comment is wrong -- the kimage_text and kimage_end values are needed > > > for > > > NEW_VMEMMAP kernels that are NOT configured with CONFIG_RANDOMIZE_BASE. > > > Unless you consider my live system "kaslr-enabled"? > > > > I will change it to "for v4.6 or later". > > > > I have limited (or no) time to investigate a live-system case, and so if > > you have any suggestions or your own patch, it is very much appreciated. > > Yep... but I will build upon your solid foundation. > > If I see anything interesting/different after my live system upgrade, I'll let > you know. > > Thanks, > Dave > > -- > Crash-utility mailing list > Crash-utility@xxxxxxxxxx > https://www.redhat.com/mailman/listinfo/crash-utility -- Thanks, -Takahiro AKASHI -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/crash-utility