Re: [PATCH] makedumpfile: cope with not-present mem section

On 02/20/2020 04:12 AM, HAGIO KAZUHITO(萩尾 一仁) wrote:
> Hi Cascardo,
> 
> Do you have any solution or detailed information on the failure on your kernel?
> Or could you try this branch?  It has an additional patch on top of Pingfan's
> one to avoid the false positive failure that I suspect:
> https://github.com/k-hagio/makedumpfile/tree/modify-mem_section-validation
> 
> Pingfan,
> Do you have an output of makedumpfile when the original failure occurs?
> If you don't and it's hard to get it, no need to do so.  I just would like to
> add it to your patch if available.
I ran the test on a PowerVM system. After hot-removing memory, I saved a raw
vmcore with "cp" and then ran makedumpfile against that copied vmcore. It
fails with the following error message:
# makedumpfile -x vmlinux -l -d 31 vmcore vmcore.dump
get_mem_section: Could not validate mem_section.
get_mm_sparsemem: Can't get the address of mem_section.

makedumpfile Failed.

Thanks,
Pingfan
> 
> Thanks,
> Kazu
> 
> -----Original Message-----
>> On 02/12/2020 12:11 PM, piliu wrote:
>>>
>>>
>>> On 02/06/2020 11:46 AM, piliu wrote:
>>>>
>>>>
>>>> On 02/05/2020 05:18 AM, HAGIO KAZUHITO wrote:
>>>>>> -----Original Message-----
>>>>>> On Tue, Feb 04, 2020 at 02:24:17PM +0800, piliu wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Sorry for the late reply; I was away for a long holiday.
>>>>>>>
>>>>>>> I have tested this patch against v4.15 and the latest kernel, each with
>>>>>>> small modifications to reproduce the situation we discussed here. Both
>>>>>>> work fine.
>>>>>>>
>>>>>>> These are the modifications made to the two kernels:
>>>>>>>
>>>>>>> test1. Latest kernel with two extra modifications to expose the problem:
>>>>>>> -1.1. Revert commit 1f503443e7df8dc8366608b4d810ce2d6669827c
>>>>>>> ("mm/sparse.c: reset section's mem_map when fully deactivated"); this
>>>>>>> commit works around this bug.
>>>>>>> -1.2. Revert commit a0b1280368d1e91ab72f849ef095b4f07a39bbf1 ("kdump:
>>>>>>> write correct address of mem_section into vmcoreinfo"). This creates
>>>>>>> the buggy situation we discussed here.
>>>>>>> -1.3. Fix the build breakage caused by reverting
>>>>>>> a0b1280368d1e91ab72f849ef095b4f07a39bbf1.
>>>>>>>
>>>>>>> test2. v4.15, which includes both commits 83e3c48729d9 and a0b1280368d1.
>>>>>>> -2.1. Revert commit a0b1280368d1e91ab72f849ef095b4f07a39bbf1 ("kdump:
>>>>>>> write correct address of mem_section into vmcoreinfo").
>>>>>>>
>>>>>>> So I cannot see any problem with my patch.
>>>>>>> Maybe I misunderstood the discussion, but I cannot see how my original
>>>>>>> patch would break a kernel which has 83e3c48729d9 but not a0b1280368d1.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Pingfan
>>>>>>>
>>>>>>
>>>>>> You also need to test the case where 83e3c48729d9 is not present at all. Can
>>>>>> you test on a 4.4 kernel, for example? As far as I understand, a vanilla 4.4
>>>>>> kernel would not be dumpable with your patch.
>>>>>
>>>>> As far as I've tested this patch with SPARSEMEM_EXTREME vmcores below, it's OK:
>>>>>   - 51 vmcores of vanilla kernels (each from 2.6.36 through 5.5) on hand
>>>>>   - one more vanilla 4.4.0 kernel with a different config from the above
>>>>>
>>>>> So apparently not all vanilla 4.4 kernels are affected by the patch.
>>>>>
>>>> Sorry, due to tight hardware resources in our lab, I have not yet been able
>>>> to test v4.4 on a system with hotplug memory. I am still trying to find a
>>>> machine.
>>>>
>>>> But from the logic of this patch, it just makes the following changes:
>>>> -1. After memory is hot-removed, either mem_section.section_mem_map is NULL
>>>> or mem_section.section_mem_map lacks SECTION_MARKED_PRESENT. In both cases
>>>> we will have mem_maps[section_nr] = mem_map = NOT_MEMMAP_ADDR, so the
>>>> section will be skipped later.
>>>> -2. If a section is not populated, mem_section.section_mem_map is NULL. It
>>>> can follow the same handling as the hot-removed case and be skipped during
>>>> parsing.
>>>>
>>>> And in v4.4 sparse_remove_one_section() just assigns ms->section_mem_map
>>>> = 0, which is handled as a NULL section_mem_map by the above changes.
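
The two cases above can be illustrated with a minimal sketch. This is not the
patch itself: decode_section_mem_map() and count_dumpable_sections() are
hypothetical helpers, and the flag value, flag mask, and NOT_MEMMAP_ADDR value
below are assumptions for illustration only. The real definitions live in the
kernel and makedumpfile sources.

```c
#include <stddef.h>

/* Assumed values for illustration only; see the kernel and makedumpfile
 * headers for the real definitions. */
#define SECTION_MARKED_PRESENT 0x1UL  /* assumed: bit 0 marks a present section */
#define SECTION_FLAG_MASK      0x7UL  /* hypothetical: low bits carry flags */
#define NOT_MEMMAP_ADDR        0x0UL  /* assumed sentinel for "no mem_map" */

/*
 * Hypothetical helper: turn a raw section_mem_map value into a mem_map
 * address, or NOT_MEMMAP_ADDR when the section must be skipped.
 */
static unsigned long decode_section_mem_map(unsigned long section_mem_map)
{
    /* Never populated, or hot-removed and reset to 0 (as in v4.4). */
    if (section_mem_map == 0)
        return NOT_MEMMAP_ADDR;

    /* Hot-removed: the value is non-zero but the present flag is clear. */
    if (!(section_mem_map & SECTION_MARKED_PRESENT))
        return NOT_MEMMAP_ADDR;

    /* Present section: strip the flag bits to get the encoded address. */
    return section_mem_map & ~SECTION_FLAG_MASK;
}

/* Count the sections that would not be skipped during parsing. */
static size_t count_dumpable_sections(const unsigned long *raw, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (decode_section_mem_map(raw[i]) != NOT_MEMMAP_ADDR)
            count++;
    }
    return count;
}
```

In this sketch a zero value and a non-zero value without the present flag both
decode to NOT_MEMMAP_ADDR, so a later parsing loop only needs a single check
to skip them.
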
>> Ping. None of us can reproduce this bug with a v4.4 kernel, and
>> furthermore there is no code analysis showing that this patch would
>> break makedumpfile on any kernel version.
>>
>> Could this better-to-have patch be accepted?
>>
>> Thanks,
>> Pingfan
>>> Last night I got a machine to test this scenario. With my patch applied,
>>> makedumpfile still works with a v4.4 kernel.
>>>
>>> Thanks,
>>> Pingfan
>>>
> 


_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec



