From: Cliff Wickman <cpw@xxxxxxx>

Gentlemen of kexec,

I have been working on enabling kdump on some very large systems, and
have found some solutions that I hope you will consider.

The first issue is working within the restricted size of crashkernel
memory under 2.6.32-based kernels, such as sles11 and rhel6.  The
second issue is reducing the very large size of a dump of a big-memory
system, even when that system is idle.

These are my propositions:

Size of crashkernel memory:
1) use raw i/o for writing the dump
2) use the root device for the bitmap file (not tmpfs)
3) use raw i/o for reading/writing the bitmaps

Size of dump (and hence the duration of dumping):
4) exclude page structures for unused pages

1) is quite easy.  The cache of pages needs to be aligned on a block
boundary and written in block multiples, as O_DIRECT files require.
Using raw i/o keeps the crash kernel's page cache from growing.
(Illustrative sketches of points 1, 2 and 4 appear at the end of
this mail.)

2) is also quite easy.  My patch finds the path to the crash kernel's
root device by examining the dump pathname.  Without this, storing the
bitmaps in a file conserves no memory at all, because they are written
to tmpfs.

3) Raw i/o for the bitmaps is accomplished by caching the bitmap file
the same way as the dump file.  I find that direct i/o is not
significantly slower than writing through the kernel's page cache.

4) Excluding unused kernel page structures is very important on a
large-memory system.  The kernel otherwise includes 3.67 million pages
of page structures per TB of memory (268,435,456 pfns per TB, each with
a 56-byte struct page, comes to about 3.67 million 4096-byte pages).
By contrast, the rest of the kernel is only about 1 million pages.

Test results are below, for systems of 1TB, 2TB, 8.8TB and 16TB.
(There are no 'unpatched' numbers for 16TB, as the time and space
requirements made those runs effectively useless.)  Run times were
generally reduced 2-3x, and dump size by roughly 8x.  All timings used
512M of crashkernel memory.

System memory size: 1TB

  OS: rhel6.4 (does a free-pages pass)
                    unpatched   patched
    page scan time     1.6min    1.6min
    dump copy time     2.4min     .4min
    total time         4.1min    2.0min
    dump size           3014M      364M

  OS: rhel6.5
                    unpatched   patched
    page scan time      .6min     .6min
    dump copy time     2.3min     .5min
    total time         2.9min    1.1min
    dump size           3011M      423M

  OS: sles11sp3 (3.0.93)
                    unpatched   patched
    page scan time      .5min     .5min
    dump copy time     2.3min     .5min
    total time         2.8min    1.0min
    dump size           2950M      350M

System memory size: 2TB

  OS: rhel6.5 (cyclic x3)
                    unpatched   patched
    page scan time     2.0min    1.8min
    dump copy time     8.0min    1.5min
    total time        10.0min    3.3min
    dump size           6141M      835M

System memory size: 8.8TB

  OS: rhel6.5 (cyclic x5)
                    unpatched   patched
    page scan time     6.6min    5.5min
    dump copy time    67.8min    6.2min
    total time        74.4min   11.7min
    dump size           15.8G      2.7G

System memory size: 16TB (patched only)

  OS: rhel6.4
    page scan time   125.3min
    dump copy time    13.2min
    total time       138.5min
    dump size            4.0G

  OS: rhel6.5
    page scan time    27.8min
    dump copy time    13.3min
    total time        41.1min
    dump size            4.1G

Page scan time is greatly affected by whether or not the kernel
supports mmap of /proc/vmcore (note the rhel6.4 vs. rhel6.5 scan times
at 16TB).  The choice of snappy vs. zlib compression becomes fairly
irrelevant once the dump size shrinks this dramatically; the numbers
above were taken with snappy compression.

I am sending my two working patches.  They are kludgy in the sense that
they ignore all forms of kdump except the creation of a disk dump, and
all architectures except x86_64.  But I think they are sufficient to
demonstrate the sizable savings in time, crashkernel space and disk
space that are possible.
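
To make points 1) and 3) concrete, here is a minimal sketch of
block-aligned, block-multiple writing through O_DIRECT.  It is not the
patch itself: the block size is assumed to be 4096 bytes, and the
function names (open_dump, write_dump, flush_dump) are illustrative.
The same staging scheme serves for the bitmap file.

    #define _GNU_SOURCE             /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define BLOCK_SIZE   4096UL             /* assumed; probe the device in real code */
    #define CACHE_BYTES  (64 * BLOCK_SIZE)  /* flush in 256K chunks */

    static char *cache;       /* block-aligned staging buffer */
    static size_t cached;     /* bytes currently staged */
    static off_t total;       /* true (unpadded) bytes written */
    static int dump_fd;

    int open_dump(const char *path)
    {
            dump_fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
            if (dump_fd < 0)
                    return -1;
            /* O_DIRECT needs a block-aligned buffer */
            return posix_memalign((void **)&cache, BLOCK_SIZE, CACHE_BYTES);
    }

    int write_dump(const void *buf, size_t len)
    {
            while (len) {
                    size_t room = CACHE_BYTES - cached;
                    size_t n = len < room ? len : room;

                    memcpy(cache + cached, buf, n);
                    cached += n;
                    total += n;
                    buf = (const char *)buf + n;
                    len -= n;

                    /* only full, block-multiple cache loads hit the disk */
                    if (cached == CACHE_BYTES) {
                            if (write(dump_fd, cache, cached) != (ssize_t)cached)
                                    return -1;
                            cached = 0;
                    }
            }
            return 0;
    }

    int flush_dump(void)
    {
            /* pad the tail to a block multiple, as O_DIRECT requires ... */
            size_t tail = (cached + BLOCK_SIZE - 1) & ~(BLOCK_SIZE - 1);

            if (tail) {
                    memset(cache + cached, 0, tail - cached);
                    if (write(dump_fd, cache, tail) != (ssize_t)tail)
                            return -1;
            }
            /* ... then trim the file back to its true length */
            return ftruncate(dump_fd, total);
    }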
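
Point 2) can be illustrated similarly.  The real patch derives the
root device from the dump pathname; this sketch takes a shortcut with
the same effect, creating the bitmap file beside the dump file so that
it lands on the disk rather than in tmpfs.  open_bitmap_file is an
illustrative name, not makedumpfile's; a real version would then
reopen the file with O_DIRECT, per point 3).

    #include <libgen.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int open_bitmap_file(const char *dumpfile)
    {
            char dir[PATH_MAX], path[PATH_MAX];

            /* dirname() may modify its argument, so work on a copy */
            strncpy(dir, dumpfile, sizeof(dir) - 1);
            dir[sizeof(dir) - 1] = '\0';

            /* beside the dump => on the root (disk) device, not tmpfs */
            snprintf(path, sizeof(path), "%s/kdump_bitmapXXXXXX", dirname(dir));

            return mkstemp(path);   /* caller unlinks once it is open */
    }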
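
Finally, a conceptual sketch of point 4).  On x86_64 the page
structures live in the vmemmap region; a vmemmap page whose struct
pages all describe pfns already excluded from the dump carries nothing
useful and can itself be excluded.  The constants (56-byte struct page,
vmemmap base) and the helpers (is_dumpable, clear_dumpable,
paddr_of_vmemmap) are assumptions for illustration only; the actual
patch differs in detail.

    #include <stdint.h>

    #define PAGE_SIZE        4096UL
    #define STRUCT_PAGE_SIZE 56UL                 /* sizeof(struct page), assumed */
    #define VMEMMAP_START    0xffffea0000000000UL /* x86_64 vmemmap base */

    /* hypothetical helpers: bitmap accessors (one bit per pfn) and a
     * page-table walk to find the physical page backing a vmemmap page */
    extern int  is_dumpable(uint64_t pfn);
    extern void clear_dumpable(uint64_t pfn);
    extern uint64_t paddr_of_vmemmap(uint64_t vaddr);

    void exclude_unused_vmemmap(uint64_t max_pfn)
    {
            uint64_t vaddr, end = VMEMMAP_START + max_pfn * STRUCT_PAGE_SIZE;

            for (vaddr = VMEMMAP_START; vaddr < end; vaddr += PAGE_SIZE) {
                    /* pfns whose struct pages live (even partly) here */
                    uint64_t lo = (vaddr - VMEMMAP_START) / STRUCT_PAGE_SIZE;
                    uint64_t hi = (vaddr + PAGE_SIZE - 1 - VMEMMAP_START)
                                  / STRUCT_PAGE_SIZE;
                    uint64_t pfn;
                    int needed = 0;

                    for (pfn = lo; pfn <= hi && pfn < max_pfn; pfn++)
                            if (is_dumpable(pfn)) {
                                    needed = 1;
                                    break;
                            }

                    /* no described page is in the dump, so the vmemmap
                     * page itself need not be either */
                    if (!needed)
                            clear_dumpable(paddr_of_vmemmap(vaddr) / PAGE_SIZE);
            }
    }

This is the mechanism that attacks the 3.67 million pages of page
structures per TB noted above.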