Re: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())

Dave Jones <davej@xxxxxxxxxxxxxxxxx> · Fri, 18 Oct 2019 00:36:24 -0400

On Thu, Oct 17, 2019 at 08:55:54PM +0000, Kazuhito Hagio wrote:

 > > I'll rework things so that it redirects to a file instead of dmesg, but
 > > it's going to take me a while to get that deployed and tested.
 > 
 > If your hosts have enough disk space, there is another way: copy
 > /proc/vmcore with cp and run makedumpfile on the copy after reboot.
 > For example:
 > 
 >   # cp --sparse=always /proc/vmcore vmcore.cp
 >   reboot
 >   # makedumpfile -E -d 31 --message-level 31 --cyclic-buffer 4096 vmcore.cp dump.Ed31

I did try something like this (but without the --sparse flag).
It took around 90 minutes to dump a 256GB core in my test, which isn't going
to be viable for our production hosts where I'm seeing the corruption
problems.
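
For a rough sense of scale, the arithmetic behind that figure works out
as below. This is only a back-of-the-envelope sketch: it assumes "256GB"
means 256 GiB and that the copy ran at a steady rate, neither of which
was measured precisely.

```shell
# Rough average throughput of the test dump (assumes 256 GiB, steady rate).
size_mib=$((256 * 1024))    # 256 GiB expressed in MiB
secs=$((90 * 60))           # 90 minutes in seconds
rate=$((size_mib / secs))   # integer division, so this rounds down
echo "${rate} MiB/s"        # prints "48 MiB/s"
```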

I've also tried, unsuccessfully, to replicate it on an
isolated machine with similar specifications.

I'll give the sparse flag a try, though if memory is full enough to
panic-on-oom (which seems to be one common trigger for this issue),
things might not be quite as sparse as I hope.
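
One way to see how much --sparse=always actually saved is to compare the
apparent size of the copy with the space it occupies on disk. The helper
below is a hypothetical sketch (the name sparse_report is made up for
illustration) and assumes GNU du, which provides --apparent-size.

```shell
# Hypothetical helper: report apparent vs. allocated size of a file in KiB.
# Assumes GNU coreutils du (for --apparent-size).
sparse_report() {
    file=$1
    apparent=$(du -k --apparent-size "$file" | cut -f1)   # logical file size
    allocated=$(du -k "$file" | cut -f1)                  # blocks actually used
    echo "apparent=${apparent}KiB allocated=${allocated}KiB"
}
```

Running "sparse_report vmcore.cp" after the cp would show both numbers;
if they are close, the copy was not very sparse.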

 > where the --cyclic-buffer option is needed to make it behave as it does
 > in the 2nd kernel on one of your hosts:
 >   [   13.341818] Buffer size for the cyclic mode: 4194304
 > 
 > The captured vmcore.cp may be useful for trying the next patch first.

We had similar thoughts ;)
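
As a sanity check on the numbers quoted above: the 4096 passed to
--cyclic-buffer lines up with the 4194304 in the log line if the option
is taken in kilobytes (which is how the makedumpfile man page describes
it) and the log value is in bytes. The latter is my assumption from the
magnitude, not something the log states.

```shell
# --cyclic-buffer value from the command above, assumed to be in KiB.
kib=4096
bytes=$((kib * 1024))   # convert KiB to bytes
echo "$bytes"           # prints 4194304, matching the logged buffer size
```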

	Dave
