On 2023/05/10 14:02, HAGIO KAZUHITO(萩尾 一仁) wrote:
> On 2023/05/10 10:44, HAGIO KAZUHITO(萩尾 一仁) wrote:
>> On 2023/05/10 4:33, Luiz Capitulino wrote:
>>> On 2023-05-09 03:32, HAGIO KAZUHITO(萩尾 一仁) wrote:
>>>> On 2023/05/02 3:41, Luiz Capitulino wrote:
>>>>> Hi all,
>>>>>
>>>>> I'm trying to run latest crash (HEAD 2505a65ff54) against kernel
>>>>> 4.14.314 but I'm getting the error below on startup.
>>>>>
>>>>> Is this a known issue? If not, any suggestions on how to debug it?
>>>>
>>>> hmm, I tried the kernel version, but could not reproduce it.
>>>>
>>>> crash> sys
>>>>       KERNEL: /lib/modules/4.14.314/build/vmlinux
>>>>     DUMPFILE: /proc/kcore
>>>>         CPUS: 4
>>>>         DATE: Tue May  9 16:16:14 JST 2023
>>>>       UPTIME: 00:07:02
>>>> LOAD AVERAGE: 0.07, 0.12, 0.07
>>>>        TASKS: 174
>>>>     NODENAME: rhel78b
>>>>      RELEASE: 4.14.314
>>>>      VERSION: #1 SMP Tue May 9 15:28:59 JST 2023
>>>>      MACHINE: x86_64  (3408 Mhz)
>>>>       MEMORY: 4 GB
>>>>
>>>> Could you upload a startup log with "crash -d 8" option?
>>>
>>> I'm attaching a file with this information, thanks a lot for looking
>>> into this.
>>
>> Thanks.
>>
>> -----
>> module: ffffffffa00f8f80
>> <readmem: ffffffffa00f8f80, KVADDR, "module struct", 896, (ROE|Q), 122f800>
>> <readmem: 200e000, PHYSADDR, "pud page", 4096, (FOE), 1c95e00>
>> <read_proc_kcore: addr: 200e000 paddr: 200e000 cnt: 4096>
>> crash: seek error: physical address: 200e000 type: "pud page"
>> -----
>>
>> It seems that the virt to phys conversion for ffffffffa00f8f80 fails
>> because the file offset of the pud page is not found in /proc/kcore.
>>
>> According to read_proc_kcore(), it does
>> 1. p2v for 200e000 i.e. phys:200e000 --> virt:???
>> 2. search /proc/kcore pt_loads for the corresponding file offset to the
>>    virtual address. (as pc->curcmd_flags does not have MEMTYPE_KVADDR.)
>> 3. read the file offset.
>>
>> so, what is the converted virtual address?
>> For example,
>>
>> --- a/netdump.c
>> +++ b/netdump.c
>> @@ -4362,6 +4362,8 @@ read_proc_kcore(int fd, void *bufptr, int cnt, ulong addr, physaddr_t paddr)
>>          else
>>                  kvaddr = PTOV((ulong)paddr);
>>
>> +        fprintf(fp, "kvaddr: %lx\n", kvaddr);
>> +
>>          offset = UNINITIALIZED;
>>          readcnt = cnt;
>>
>
> Ah, probably got it.
>
> The PTOV() above is defined like this:
>
> #define PTOV(X)               ((unsigned long)(X)+(machdep->kvbase))
>
>>
>> Your kernel has the following pt_load information, probably it's out of
>> these vaddr ranges?
>>
>>       offset  vaddr            end               paddr            end        size
>> 7fffff604000  ffffffffff600000-ffffffffff601000  ffffffffffffffff-        0  (1000)
>> 7fff81004000  ffffffff81000000-ffffffff8377f000           1000000-  377f000  (277f000)
>> 490000004000  ffffc90000000000-ffffe90000000000  ffffffffffffffff-        0  (1fffffffffff)
>> 7fffa0004000  ffffffffa0000000-ffffffffff000000  ffffffffffffffff-        0  (5f000000)
>>  88000005000  ffff888000001000-ffff88800009f000              1000-    9f000  (9e000)
>> 6a0000004000  ffffea0000000000-ffffea0000003000  ffffffffffffffff-        0  (3000)
>>  88000104000  ffff888000100000-ffff8880bffe8000            100000- bffe8000  (bfee8000)
>> 6a0000008000  ffffea0000004000-ffffea0003000000  ffffffffffffffff-        0  (2ffc000)
>>  88100004000  ffff888100000000-ffff888fff000000         100000000-fff000000  (eff000000)
>> 6a0004004000  ffffea0004000000-ffffea003ffc0000  ffffffffffffffff-        0  (3bfc0000)
>
> Your kernel looks configured without CONFIG_RANDOMIZE_BASE.  For such
> kernels, a hard-coded value is used for PAGE_OFFSET and kvbase.  And
> I found that Linux 4.14.84 and later has the recent PAGE_OFFSET.
>
>          case POST_GDB:
>                  if (!(machdep->flags & RANDOMIZED) &&
>                      ((THIS_KERNEL_VERSION >= LINUX(4,19,5)) ||
>                      ((THIS_KERNEL_VERSION >= LINUX(4,14,84)) &&
>                      (THIS_KERNEL_VERSION < LINUX(4,15,0))))) {
>                          machdep->machspec->page_offset = machdep->flags & VM_5LEVEL ?
>                                  PAGE_OFFSET_5LEVEL_4_20 : PAGE_OFFSET_4LEVEL_4_20;
>                          machdep->kvbase = machdep->machspec->page_offset;
>
> #define PAGE_OFFSET_4LEVEL_4_20    0xffff888000000000
>
> But, the THIS_KERNEL_VERSION and LINUX() macros are defined like this:
>
> #define THIS_KERNEL_VERSION ((kt->kernel_version[0] << 16) + \
>                               (kt->kernel_version[1] << 8) + \
>                               (kt->kernel_version[2]))
> #define LINUX(x,y,z) (((uint)(x) << 16) + ((uint)(y) << 8) + (uint)(z))
>
> So (THIS_KERNEL_VERSION < LINUX(4,15,0)) is false on Linux 4.14.256 and
> later, and the old PAGE_OFFSET will be used.
>
> So does this patch work well?

I also confirmed that the issue can be reproduced on a kernel built
without CONFIG_RANDOMIZE_BASE, and that this patch fixes it.  I have
posted a formal patch, so please try that.

Thanks,
Kazu

>
> --- a/defs.h
> +++ b/defs.h
> @@ -807,10 +807,10 @@ struct kernel_table {                   /* kernel data */
>          } \
>   }
>
> -#define THIS_KERNEL_VERSION ((kt->kernel_version[0] << 16) + \
> -                             (kt->kernel_version[1] << 8) + \
> +#define THIS_KERNEL_VERSION ((kt->kernel_version[0] << 24) + \
> +                             (kt->kernel_version[1] << 16) + \
>                               (kt->kernel_version[2]))
> -#define LINUX(x,y,z) (((uint)(x) << 16) + ((uint)(y) << 8) + (uint)(z))
> +#define LINUX(x,y,z) (((uint)(x) << 24) + ((uint)(y) << 16) + (uint)(z))
>
>   #define THIS_GCC_VERSION    ((kt->gcc_version[0] << 16) + \
>                                (kt->gcc_version[1] << 8) + \
>
> Thanks,
> Kazu

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki