Re: [PATCH] makedumpfile: cope with not-present mem section

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 04, 2020 at 02:24:17PM +0800, piliu wrote:
> Hi,
> 
> Sorry to reply late due to a long festival.
> 
> I have tested this patch against v4.15 and latest kernel with small
> modification to meet the situation we discussed here. Both work fine.
> 
> The below is the modification of two kernel
> 
> test1. latest kernel with two extra modification to expose the problem
> -1.1 reverts commit 1f503443e7df8dc8366608b4d810ce2d6669827c
> (mm/sparse.c: reset section's mem_map when fully deactivated), this
> commit work around this bug
> -1.2. reverts commit a0b1280368d1e91ab72f849ef095b4f07a39bbf1 ("kdump:
> write correct address of mem_section into vmcoreinfo"). This will create
> a buggy situation as we discussed here.
> -1.3. fix building bug due to revert
> a0b1280368d1e91ab72f849ef095b4f07a39bbf1
> 
> test2. v4.15, which include both commit 83e3c48729d9 and a0b1280368d1.
> -2.1. revert commit a0b1280368d1e91ab72f849ef095b4f07a39bbf1 ("kdump:
> write correct address of mem_section into vmcoreinfo")
> 
> So I can not see any problem with my patch.
> Maybe I misunderstand the discussion, but I can not see my original
> patch will break the kernel which have 83e3c48729d9 but not a0b1280368d1.
> 
> Thanks,
> Pingfan
> 

You also need to test the case where 83e3c48729d9 is not present at all. Can
you test on a 4.4 kernel, for example? As far as I understand, a vanilla 4.4
kernel would not be dumpable with your patch.

Thanks.
Cascardo.

> On 01/29/2020 03:33 AM, Thadeu Lima de Souza Cascardo wrote:
> > On Tue, Jan 28, 2020 at 05:03:12PM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> >> Hi Cascardo,
> >>
> >>> -----Original Message-----
> >>> On Mon, Jan 27, 2020 at 02:04:54PM -0300, Thadeu Lima de Souza Cascardo wrote:
> >>>> Sorry for taking too long to respond, as I was on vacation.
> >>>>
> >>>> The kernels that had commit 83e3c48729d9, but not commit a0b1280368d1, are
> >>>> not supported anymore. In a way that it's even hard for me to test them.
> >>>>
> >>>> However, I managed to test it, and those two lines are definitively needed
> >>>> to dump a VM running such a kernel. Is removing them really needed to fix
> >>>> this issue?
> >>>>
> >>>> Otherwise, I would rather keep them.
> >>>>
> >>>> Thanks.
> >>>> Cascardo.
> >>>
> >>> By the way, I was too fast in sending this. We really need to keep those lines
> >>> as makedumpfile will fail to dump a 4.4 kernel with this patch as is.
> >>
> >> Is that Ubuntu 4.4 kernel which has 83e3c48729d9 and not a0b1280368d1?
> >> Could you elaborate on how it fails?
> > 
> > No, it doesn't have either, so my guess is it would fail on upstream 4.4 as
> > well, so anything that doesn't have 83e3c48729d9.
> > 
> > That's what I get on that 4.4 kernel (4.4.0-171-generic):
> > 
> > # ./makedumpfile /proc/vmcore ../dump
> > get_mem_section: Could not validate mem_section.
> > get_mm_sparsemem: Can't get the address of mem_section.
> > 
> > makedumpfile Failed.
> > #
> > 
> > So, now, I have a better grasp of the whole logic, and understand why it fails
> > with this patch.
> > 
> > So, we need to either interpret the mem_section as a pointer to the array of a
> > pointer to the pointer to the array. The only case the second option is valid
> > is when sparse_extreme is on, so we don't even need to check the second case
> > when it's off.
> > 
> > Then, we check that interpreting it either way is valid. If it's valid in both
> > interpretations, we can't decide which to use, and will fail. So far, we
> > haven't seen any case in the field where that would accidentally happen. But in
> > case it does, we should rather fail to dump and fallback to copying, than
> > creating a bogus compressed dump.
> > 
> > When this patch is applied, a kernel which does not have 83e3c48729d9, and
> > thus, has mem_section as a direct pointer to the array, it so happens that we
> > don't detect the pointer to pointer to the array case as invalid, thus failing
> > to dump.
> > 
> > The way we validate is that the mem_maps should either have the PRESENT bit or
> > be NULL. Now, that assumption may be invalid, and we would need to do the
> > validation some other way. I can test the cases where that assumption is
> > invalid in a 4.4 kernel and see how to fix this in a satisfactory way.
> > 
> > Going through the code once again, I don't see how the second section of the
> > patch would be correct by itself too. I will think it through a little more and
> > see if I can come up with a solution.
> > 
> > Regards.
> > Cascardo.
> > 
> >>
> >> I'm thinking that Pingfan's thought may help:
> >>>> I think it could be if/else, no need to call twice.
> >>
> >> Thanks,
> >> Kazu
> >>
> >>>
> >>> Cascardo.
> > 
> 

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec




[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux