On Tue, Jan 28, 2020 at 05:03:12PM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote: > Hi Cascardo, > > > -----Original Message----- > > On Mon, Jan 27, 2020 at 02:04:54PM -0300, Thadeu Lima de Souza Cascardo wrote: > > > Sorry for taking too long to respond, as I was on vacation. > > > > > > The kernels that had commit 83e3c48729d9, but not commit a0b1280368d1, are > > > not supported anymore. In a way that it's even hard for me to test them. > > > > > > However, I managed to test it, and those two lines are definitively needed > > > to dump a VM running such a kernel. Is removing them really needed to fix > > > this issue? > > > > > > Otherwise, I would rather keep them. > > > > > > Thanks. > > > Cascardo. > > > > By the way, I was too fast in sending this. We really need to keep those lines > > as makedumpfile will fail to dump a 4.4 kernel with this patch as is. > > Is that Ubuntu 4.4 kernel which has 83e3c48729d9 and not a0b1280368d1? > Could you elaborate on how it fails? No, it doesn't have either, so my guess is it would fail on upstream 4.4 as well, so anything that doesn't have 83e3c48729d9. That's what I get on that 4.4 kernel (4.4.0-171-generic): # ./makedumpfile /proc/vmcore ../dump get_mem_section: Could not validate mem_section. get_mm_sparsemem: Can't get the address of mem_section. makedumpfile Failed. # So, now, I have a better grasp of the whole logic, and understand why it fails with this patch. So, we need to either interpret the mem_section as a pointer to the array of a pointer to the pointer to the array. The only case the second option is valid is when sparse_extreme is on, so we don't even need to check the second case when it's off. Then, we check that interpreting it either way is valid. If it's valid in both interpretations, we can't decide which to use, and will fail. So far, we haven't seen any case in the field where that would accidentally happen. But in case it does, we should rather fail to dump and fallback to copying, than creating a bogus compressed dump. When this patch is applied, a kernel which does not have 83e3c48729d9, and thus, has mem_section as a direct pointer to the array, it so happens that we don't detect the pointer to pointer to the array case as invalid, thus failing to dump. The way we validate is that the mem_maps should either have the PRESENT bit or be NULL. Now, that assumption may be invalid, and we would need to do the validation some other way. I can test the cases where that assumption is invalid in a 4.4 kernel and see how to fix this in a satisfactory way. Going through the code once again, I don't see how the second section of the patch would be correct by itself too. I will think it through a little more and see if I can come up with a solution. Regards. Cascardo. > > I'm thinking that Pingfan's thought may help: > >> I think it could be if/else, no need to call twice. > > Thanks, > Kazu > > > > > Cascardo. _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec