On 27.09.19 13:35, Aneesh Kumar K.V wrote:
> On 9/27/19 4:10 PM, David Hildenbrand wrote:
>> On 27.09.19 12:36, Aneesh Kumar K.V wrote:
>>> On 9/27/19 1:16 PM, David Hildenbrand wrote:
>>>> On 27.09.19 03:51, Aneesh Kumar K.V wrote:
>>>>> On 9/27/19 4:15 AM, Andrew Morton wrote:
>>>>>> On Thu, 26 Sep 2019 17:55:51 +0530 "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>>> With altmap, all the resource pfns are not initialized. While initializing
>>>>>>> pfn, altmap reserve space is skipped. Hence when removing pfn from zone skip
>>>>>>> pfns that were never initialized.
>>>>>>>
>>>>>>> Update memunmap_pages to calculate start and end pfn based on altmap
>>>>>>> values. This fixes a kernel crash that is observed when destroying namespace.
>>>>>>>
>>>>>>> [ 74.745056] BUG: Unable to handle kernel data access at 0xc00c000001400000
>>>>>>> [ 74.745256] Faulting instruction address: 0xc0000000000b58b0
>>>>>>> cpu 0x2: Vector: 300 (Data Access) at [c00000026ea93580]
>>>>>>> pc: c0000000000b58b0: memset+0x68/0x104
>>>>>>> lr: c0000000003eb008: page_init_poison+0x38/0x50
>>>>>>> ...
>>>>>>> current = 0xc000000271c67d80
>>>>>>> paca = 0xc00000003fffd680 irqmask: 0x03 irq_happened: 0x01
>>>>>>> pid = 3665, comm = ndctl
>>>>>>> [link register ] c0000000003eb008 page_init_poison+0x38/0x50
>>>>>>> [c00000026ea93830] c0000000004754d4 remove_pfn_range_from_zone+0x64/0x3e0
>>>>>>> [c00000026ea938a0] c0000000004b8a60 memunmap_pages+0x300/0x400
>>>>>>> [c00000026ea93930] c0000000009e32a0 devm_action_release+0x30/0x50
>>>>>>
>>>>>> Doesn't apply to mainline or -next. Which tree is this against?
>>>>>>
>>>>>
>>>>> After applying the patches from David on mainline. That is the reason I
>>>>> replied to this thread. I should have mentioned in the email that it is
>>>>> based on patch series "[PATCH v4 0/8] mm/memory_hotplug: Shrink zones
>>>>> before removing memory"
>>>>
>>>> So if I am not wrong, my patch "[PATCH v4 4/8] mm/memory_hotplug: Poison
>>>> memmap in remove_pfn_range_from_zone()" makes it show up that we
>>>> actually call _remove_pages() with wrong parameters, right?
>>>>
>>>> If so, I guess it would be better for you to fix it before my series and
>>>> I will rebase my series on top of that.
>>>>
>>>
>>> I posted a patch that can be applied to mainline. I sent that as a reply
>>> to this email. Can you include that and PATCH 2 as first two patches in
>>> your series? That should help to locate the full patch series needed
>>> for fixing the kernel crash.
>>
>> I can drag these along, unless Andrew wants to pick them up right away
>> (or we're waiting for more feedback).
>
> Considering this patch alone won't fix the issue, It would be nice if we
> could club them with rest of the changes.
>

I'll drag them along, adding Pankaj's RB's. If they get picked up
independently, fine :)

--

Thanks,

David / dhildenb