On Wed, Sep 26, 2007 at 03:34:10PM +0800, Huang, Ying wrote:
> Hi,
>
> I have a proposal to do crashdump without reserving memory during system
> boot. The method is as follow:
>
> 1. Do not reserve memory during system boot, that is
> crashkernel=<XX>@<YY> is not used in kernel command line.
>
> 2. A new kexec flag named KEXEC_CRASH_BY_NORMAL is defined for
> sys_kexec_load system call. When this flag is specified, the
> sys_kexec_load works as normal kexec (not crash kexec), except the
> destination image is kexec_crash_image instead of kexec_image.
>
> 3. In kexec-tools (/sbin/kexec), --mem-min=<addr1> and --mem-max=<addr2>
> is used to specify the memory area used by crashdump kernel. That is,
> the image, elf core header, available memory of crashdump kernel is
> within <addr1> ~ <addr2>.
>

Probably this can be an optional thing. Anyway, if destination pages are
going to be backed up in source pages, the user does not have to specify
--mem-min and --mem-max.

> 4. In kexec-tools, in addition to kernel image, elf core header, etc are
> loaded, the available memory of crashdump kernel is loaded too. For
> example, the segments for sys_kexec_load for crashdump kernel can be:
>
> --mem-min=0x100000
> --mem-max=0xffffff
>
> No.  buf         bufsz    mem       memsz
> 0    NULL        0        0x1000    0x9e000
> 1    0x881fe88   0x289b   0x100000  0x3000
> 2    NULL        0        0x103000  0xfd000
> 3    0xb7bfa808  0xb7c00  0x200000  0xb8000
> 4    NULL        0        0x2b8000  0xd39000
> 5    0x8818d38   0x7120   0xff1000  0x9000
> 6    NULL        0        0xffa000  0x1000
> 7    0x8818268   0x400    0xffb000  0x4000
> 8    NULL        0        0xfff000  0x1000
>

Maybe the user also needs to specify how much memory to allocate for the
second kernel's execution.

> 5. In relocate_kernel of Linux kernel, instead of copy the source page
> to destination page, the contents of source page and the destination
> page are swapped. (The destination page -> source page map is in
> kexec_crash_image->head) The memory area used by crashdump kernel is
> backupped to source page.
>

Interesting.
Just that it introduces more code in the crash path.

> In original crashdump implementation, the crashdump kernel run in
> reserved memory area. The reserved memory pages are reserved memory
> pages in primary (original) kernel.
>
> In this proposed implementation, the crashdump kernel run in specified
> memory area, the contents of destination memory area is backupped before
> crashdump kernel running. The backup pages are allocated memory pages in
> primary (original) kernel.
>

How would you prepare ELF headers for the backed-up memory? ELF headers
are created in user space, and kexec-tools needs to know the physical
address of the actual data before sys_kexec_load is executed. But in
this scheme, source pages will be allocated only after sys_kexec_load
has been called. These source page addresses will have to be exported to
user space so that kexec-tools can fill in the ELF headers accordingly.

>
> The pros and cons of proposed implementation:
>
> Pros:
> - The memory used by crashdump kernel need not to be reserved during
> boot time.
> - The memory used by crashdump kernel can be specified during
> sys_kexec_load
> - The memory used by crashdump kernel can be freed after unloading.
>
> Cons:
> - The memory used by crashdump kernel can be the DMA destination, their
> contents may be ruined by devices during the boot of crashdump kernel.
> (Is it possible to turn off DMA for some memory area other than
> reserving it?)

Potential corruption because of DMA was a big issue, and that's why the
exclusive reserved area and the relocatable kernel came into the picture.
Eric had tried disabling DMA at the PCI level in the past, but I think it
did not work for him.

- There is no guarantee that one will get sufficient memory allocated
  when needed, so loading the kdump kernel might fail.

- More code in the crash path, which potentially reduces the reliability
  of the mechanism.
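
To make the ELF header question concrete, here is a minimal user-space
model of the swap described in step 5. It is only a sketch, not kernel
code: the names swap_pages and backup_location, the page contents and all
addresses are hypothetical, and a plain list of (destination, source)
pairs stands in for the map kept in kexec_crash_image->head.

```python
PAGE_SIZE = 4096  # assumed page size for this model


def swap_pages(memory, page_map):
    """Swap each destination page with its source (backup) page.

    memory:   dict mapping a page-aligned physical address to page contents
    page_map: list of (dest_addr, src_addr) pairs; a stand-in for the
              destination -> source map (hypothetical representation)
    """
    for dest, src in page_map:
        memory[dest], memory[src] = memory[src], memory[dest]


def backup_location(page_map, addr):
    """Return the address holding the pre-swap contents of `addr`."""
    page = addr & ~(PAGE_SIZE - 1)
    offset = addr - page
    for dest, src in page_map:
        if dest == page:
            # Old contents of the destination page now live in the source page.
            return src + offset
    return addr  # page was not part of the swap, data is still in place
```

After the swap, the old contents of a destination page can only be found
through the source page address. A dump tool describing that memory would
need the result of something like backup_location(), and those source
addresses do not exist until the kernel has allocated the pages.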
>
>
> In fact, almost all mechanism for this proposal has been implemented by
> my previous patch: "kexec jump" in "kexec based hibernation".
>
>
> Any comment is welcome.
>

The idea is interesting, but at the same time it reduces the reliability
of kdump. I am especially concerned about the DMA issue and the extra
code in the crash path. I would rather try to find out if I can create
some mechanism to do large contiguous memory area allocation from user
space at run time instead of doing it at boot time.

Thanks
Vivek