Hi Bhupesh,

I tried to get some log messages for the crash kernel, but was unable to get anything.
echo c > /proc/sysrq-trigger simply hangs without any messages on the console.

Thanks,
Anil

-----Original Message-----
From: Gurumurthy, Anil
Sent: 29 November 2017 16:02
To: 'Bhupesh Sharma' <bhsharma at redhat.com>
Cc: Dave Young <dyoung at redhat.com>; kexec at lists.infradead.org
Subject: RE: kdump issues with 4.11 kernel

-----Original Message-----
From: Bhupesh Sharma [mailto:bhsharma@xxxxxxxxxx]
Sent: 29 November 2017 15:50
To: Gurumurthy, Anil <Anil.Gurumurthy at cavium.com>
Cc: Dave Young <dyoung at redhat.com>; kexec at lists.infradead.org
Subject: Re: kdump issues with 4.11 kernel

On Wed, Nov 29, 2017 at 3:36 PM, Gurumurthy, Anil <Anil.Gurumurthy at cavium.com> wrote:
>
>
> -----Original Message-----
> From: Bhupesh Sharma [mailto:bhsharma at redhat.com]
> Sent: 29 November 2017 15:16
> To: Gurumurthy, Anil <Anil.Gurumurthy at cavium.com>
> Cc: Dave Young <dyoung at redhat.com>; kexec at lists.infradead.org
> Subject: Re: kdump issues with 4.11 kernel
>
> Hi Anil,
>
> On Wed, Nov 29, 2017 at 2:44 PM, Gurumurthy, Anil <Anil.Gurumurthy at cavium.com> wrote:
>> Thanks. That did help getting kexec to work.
>> However I still do not get a crash dump -
>> echo c > /proc/sysrq-trigger does not get a crash dump.
>>
>> Any thoughts?
>
> Can you share the console messages you see when the crash kernel boots?
> Or do you see nothing after the crash is introduced via
> echo c > /proc/sysrq-trigger
> [Anil] I do not see any messages after introducing the crash.

There could be several reasons for this:
- crashkernel might be missing some arch/machine specific options.
- It may be that the purgatory sha verification has failed. If your arch supports a console in purgatory then it is easy to debug this.
- It might be that the crash kernel itself crashed very early. Pass some earlycon/earlyprintk option for your system to the second kernel command line.
- Also please share relevant dmesg log of both primary kernel boot and the commands you use to invoke the crashkernel.

[Anil] Thanks for the quick response.
This is what I have in the .config (for the primary kernel):
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_PRINTK_DBGP=y
CONFIG_EARLY_PRINTK_EFI=y

The log for the primary kernel boot:
Nov 29 14:22:22 localhost journal: Runtime journal is using 8.0M (max 1.9G, leaving 2.9G of free 19.4G, current limit 1.9G).
Nov 29 14:22:22 localhost kernel: Linux version 4.11.12+ (root at localhost.localdomain) (gcc version 4.8.3 20140911 (Red Hat 4.8.3-9) (GCC) ) #4 SMP Thu Nov 23 12:11:02 IST 2017
Nov 29 14:22:22 localhost kernel: Command line: BOOT_IMAGE=/vmlinuz-4.11.12+ root=/dev/mapper/rhel-root ro rd.lvm.lv=rhel/swap crashkernel=128M rd.lvm.lv=rhel/root rhgb quiet
Nov 29 14:22:22 localhost kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Nov 29 14:22:22 localhost kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Nov 29 14:22:22 localhost kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Nov 29 14:22:22 localhost kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Nov 29 14:22:22 localhost kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
Nov 29 14:22:22 localhost kernel: e820: BIOS-provided physical RAM map:
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000006bfff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000000006c000-0x000000000006cfff] ACPI NVS
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000000006d000-0x000000000009efff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000000009f000-0x000000000009ffff] ACPI NVS
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x0000000000100000-0x000000005d184fff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005d185000-0x000000005d185fff] ACPI data
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005d186000-0x000000005fb77fff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005fb78000-0x000000005fb7cfff] reserved
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005fb7d000-0x000000005ffdffff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005ffe0000-0x000000005ffe1fff] ACPI data
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005ffe2000-0x000000005fffafff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000005fffb000-0x0000000060001fff] ACPI data
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x0000000060002000-0x0000000060009fff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000006000a000-0x000000006000efff] ACPI data
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x000000006000f000-0x000000006000ffff] usable
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x0000000060010000-0x0000000060011fff] ACPI data
Nov 29 14:22:22 localhost kernel: BIOS-e820: [mem 0x0000000060012000-0x00000000600ddfff] usable

Will try to get the other details you needed too.
-Anil

Regards,
Bhupesh

> Generally, depending on your test machine arch, it is useful to use earlycon/earlyprintk to see if the crash kernel produced any useful message until the actual console device became operational.
>
> Can you try setting the earlycon/earlyprintk settings and share the crash kernel log messages after the same?
>
> Thanks,
> Bhupesh
>
>> -----Original Message-----
>> From: Dave Young [mailto:dyoung at redhat.com]
>> Sent: 29 November 2017 13:09
>> To: Gurumurthy, Anil <Anil.Gurumurthy at cavium.com>
>> Cc: kexec at lists.infradead.org
>> Subject: Re: kdump issues with 4.11 kernel
>>
>> Hi,
>> On 11/29/17 at 05:29am, Gurumurthy, Anil wrote:
>>> Hello,
>>> I was facing trouble getting a crash dump on 4.11 kernel. Debugging a bit, I see that the kexec run from the cmd line fails. Any ideas on what I could be missing?
>>>
>>> [root at localhost ~]# kexec -p /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`kdump.img
>>> ELF core (kcore) parse failed
>>> Cannot load /boot/vmlinuz-4.11.12+
>>>
>>
>> Can you try the kexec-tools commit below:
>> commit ed15ba1b9977e506637ff1697821d97127b2c919
>> Author: Pratyush Anand <panand at redhat.com>
>> Date: Wed Mar 1 11:19:42 2017 +0530
>>
>>     build_mem_phdrs(): check if p_paddr is invalid
>>
>> Thanks
>> Dave
>>
>> _______________________________________________
>> kexec mailing list
>> kexec at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
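
Note on the earlyprintk suggestion in this thread: the kexec invocation quoted above passes no command line to the second kernel, and the primary kernel command line in the log contains "rhgb quiet". A minimal sketch of how early console options might be passed when loading the crash kernel follows. The serial device and baud rate (ttyS0, 115200), the extra options irqpoll, nr_cpus=1 and reset_devices, and the helper name build_kdump_cmd are assumptions for illustration and do not come from the thread. Only the kernel and initramfs paths and the root device are taken from the messages above. The script prints the command for review and does not run it.

```shell
#!/bin/sh
# Sketch only: compose a "kexec -p" command that gives the crash kernel an
# early console. It prints the command and does not execute it.

# Kernel version; defaults to the running kernel (the thread uses 4.11.12+).
KVER="${KVER:-$(uname -r)}"

# ASSUMPTION: a serial console on ttyS0 at 115200 baud. Adjust for the
# actual machine; this value is not taken from the thread.
EARLY_OPTS="${EARLY_OPTS:-earlyprintk=serial,ttyS0,115200 console=ttyS0,115200}"

build_kdump_cmd() {
    # root= comes from the primary kernel command line in the quoted log.
    # "rhgb quiet" is deliberately left out so boot messages stay visible.
    # ASSUMPTION: irqpoll, nr_cpus=1 and reset_devices are commonly used
    # kdump options; they are not mentioned in the thread.
    append="root=/dev/mapper/rhel-root ro irqpoll nr_cpus=1 reset_devices $EARLY_OPTS"
    printf 'kexec -p /boot/vmlinuz-%s --initrd=/boot/initramfs-%skdump.img --append="%s"\n' \
        "$KVER" "$KVER" "$append"
}

# Print the command for review before running it by hand as root.
build_kdump_cmd
```

After running the printed command as root, /sys/kernel/kexec_crash_loaded should read 1 if the crash kernel was loaded; echo c > /proc/sysrq-trigger can then be retried while watching the serial console.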