On Sat, Sep 24, 2022 at 08:27:35PM +0800, Miaohe Lin wrote:
> On 2022/9/23 22:12, Naoya Horiguchi wrote:
> > There seems another build error in aarch64 with MEMORY_HOTPLUG disabled.
> > https://lore.kernel.org/lkml/20220923110144.GA1413812@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> > , so let me revise this patch again to handle it.
> >
> > - Naoya Horiguchi
> >
> > ---
> > From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> > Date: Fri, 23 Sep 2022 22:51:20 +0900
> > Subject: [PATCH v5 4/4] mm/hwpoison: introduce per-memory_block hwpoison counter
> >
> > Currently PageHWPoison flag does not behave well when experiencing memory
> > hotremove/hotplug. Any data field in struct page is unreliable when the
> > associated memory is offlined, and the current mechanism can't tell whether
> > a memory section is onlined because a new memory devices is installed or
> > because previous failed offline operations are undone. Especially if
> > there's a hwpoisoned memory, it's unclear what the best option is.
> >
> > So introduce a new mechanism to make struct memory_block remember that
> > a memory block has hwpoisoned memory inside it. And make any online event
> > fail if the onlined memory block contains hwpoison. struct memory_block
> > is freed and reallocated over ACPI-based hotremove/hotplug, but not over
> > sysfs-based hotremove/hotplug. So it's desirable to implement hwpoison
> > counter on this struct.
> >
> > Note that clear_hwpoisoned_pages() is relocated to be called earlier than
> > now, just before unregistering struct memory_block. Otherwise, the
> > per-memory_block hwpoison counter is freed and we fail to adjust global
> > hwpoison counter properly.
> >
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> > Reported-by: kernel test robot <lkp@xxxxxxxxx>
>
> LGTM with some nits below. Thanks.
>
> Reviewed-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>

Thank you.
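
For anyone following the thread, the behaviour described in the patch
description above can be modelled in plain userspace C. This is only a
rough sketch under stated assumptions, not the kernel code: every demo_*
name and DEMO_EHWPOISON are hypothetical stand-ins, and pfn-to-block
lookup, locking and the real struct memory_block are left out. It shows
the two points the description makes: onlining is refused while a block
still counts poisoned pages, and the global counter has to be adjusted
before the block (and its counter) is freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's EHWPOISON error code. */
#define DEMO_EHWPOISON 133

/* Models the global hwpoison counter. */
static atomic_long demo_num_poisoned_pages;

/* Simplified model of struct memory_block: only what the sketch needs. */
struct demo_memory_block {
	atomic_long nr_hwpoison;	/* per-block hwpoison counter */
	int online;
};

struct demo_memory_block *demo_block_alloc(void)
{
	struct demo_memory_block *mem = calloc(1, sizeof(*mem));

	if (mem)
		atomic_init(&mem->nr_hwpoison, 0);
	return mem;
}

/* Record one poisoned page in both the block and the global counter. */
void demo_poison_page(struct demo_memory_block *mem)
{
	atomic_fetch_add(&mem->nr_hwpoison, 1);
	atomic_fetch_add(&demo_num_poisoned_pages, 1);
}

/* Onlining fails while the block still remembers poisoned pages. */
int demo_block_online(struct demo_memory_block *mem)
{
	if (atomic_load(&mem->nr_hwpoison))
		return -DEMO_EHWPOISON;
	mem->online = 1;
	return 0;
}

/* Offlining keeps the block and its counter (the sysfs-style case). */
void demo_block_offline(struct demo_memory_block *mem)
{
	mem->online = 0;
}

/*
 * Removal frees the block (the ACPI-style case). The global counter is
 * adjusted first, because the per-block count is gone after free().
 */
void demo_block_remove(struct demo_memory_block *mem)
{
	assert(mem != NULL);
	atomic_fetch_sub(&demo_num_poisoned_pages,
			 atomic_load(&mem->nr_hwpoison));
	free(mem);
}

long demo_global_poisoned(void)
{
	return atomic_load(&demo_num_poisoned_pages);
}
```

In this model, a block that is offlined, poisoned and onlined again gets
-DEMO_EHWPOISON, while a block that is removed and allocated afresh
starts with a zero counter and can be onlined.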

>
> > ---
> > ChangeLog v4 -> v5:
> > - add Reported-by of lkp bot,
> > - check both CONFIG_MEMORY_FAILURE and CONFIG_MEMORY_HOTPLUG in introduced #ifdefs,
> >   intending to fix "undefined reference" errors in aarch64.
> >
> > ChangeLog v3 -> v4:
> > - fix build error (https://lore.kernel.org/linux-mm/202209231134.tnhKHRfg-lkp@xxxxxxxxx/)
> >   by using memblk_nr_poison() to access to the member ->nr_hwpoison
> > ---
> >  drivers/base/memory.c  | 34 ++++++++++++++++++++++++++++++++++
> >  include/linux/memory.h |  3 +++
> >  include/linux/mm.h     | 24 ++++++++++++++++++++++++
> >  mm/internal.h          |  8 --------
> >  mm/memory-failure.c    | 31 ++++++++++---------------------
> >  mm/sparse.c            |  2 --
> >  6 files changed, 71 insertions(+), 31 deletions(-)
> >
> > diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> > index 9aa0da991cfb..99e0e789616c 100644
> > --- a/drivers/base/memory.c
> > +++ b/drivers/base/memory.c
> > @@ -183,6 +183,9 @@ static int memory_block_online(struct memory_block *mem)
> >  	struct zone *zone;
> >  	int ret;
> >
> > +	if (memblk_nr_poison(start_pfn))
> > +		return -EHWPOISON;
> > +
> >  	zone = zone_for_pfn_range(mem->online_type, mem->nid, mem->group,
> >  				  start_pfn, nr_pages);
> >
> > @@ -864,6 +867,7 @@ void remove_memory_block_devices(unsigned long start, unsigned long size)
> >  		mem = find_memory_block_by_id(block_id);
> >  		if (WARN_ON_ONCE(!mem))
> >  			continue;
> > +		clear_hwpoisoned_pages(memblk_nr_poison(start));
>
> clear_hwpoisoned_pages seems not a proper name now? PageHWPoison info is kept now. But this should be trivial.
>

Right, I think that the name num_poisoned_pages_sub() is clear enough,
so I'll open this function.

> >  		unregister_memory_block_under_nodes(mem);
> >  		remove_memory_block(mem);
> >  	}
> > @@ -1164,3 +1168,33 @@ int walk_dynamic_memory_groups(int nid, walk_memory_groups_func_t func,
> >  	}
> >  	return ret;
> >  }
> > +
> > +#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
> > +void memblk_nr_poison_inc(unsigned long pfn)
> > +{
> > +	const unsigned long block_id = pfn_to_block_id(pfn);
> > +	struct memory_block *mem = find_memory_block_by_id(block_id);
> > +
> > +	if (mem)
> > +		atomic_long_inc(&mem->nr_hwpoison);
> > +}
> > +
> > +void memblk_nr_poison_sub(unsigned long pfn, long i)
> > +{
> > +	const unsigned long block_id = pfn_to_block_id(pfn);
> > +	struct memory_block *mem = find_memory_block_by_id(block_id);
> > +
> > +	if (mem)
> > +		atomic_long_sub(i, &mem->nr_hwpoison);
> > +}
> > +
> > +unsigned long memblk_nr_poison(unsigned long pfn)
>
> memblk_nr_poison() is only used inside this file. Make it static?

Thanks, I'll add it.

Thanks,
Naoya Horiguchi
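
As a side note on the "undefined reference" fix mentioned in the v5
changelog: the header side of the patch is not quoted in this mail, so
the following is only a guess at the general shape, using the common
pattern of pairing guarded definitions with no-op stubs. DEMO_MEMORY_FAILURE,
DEMO_MEMORY_HOTPLUG and all demo_* names are hypothetical stand-ins for
the real config options and helpers.

```c
#include <assert.h>

/*
 * With both (hypothetical) options defined, callers see declarations
 * of out-of-line helpers. Otherwise they see empty inline stubs, so a
 * call site never refers to a symbol that was not built.
 */
#if defined(DEMO_MEMORY_FAILURE) && defined(DEMO_MEMORY_HOTPLUG)
void demo_memblk_nr_poison_inc(unsigned long pfn);
void demo_memblk_nr_poison_sub(unsigned long pfn, long i);
#else
static inline void demo_memblk_nr_poison_inc(unsigned long pfn)
{
	(void)pfn;	/* nothing to count in this configuration */
}

static inline void demo_memblk_nr_poison_sub(unsigned long pfn, long i)
{
	(void)pfn;
	(void)i;
}
#endif

/* Models the global counter, which exists in every configuration. */
static long demo_poisoned_total;

/* A caller that compiles and links whether or not the options are set. */
long demo_report_poison(unsigned long pfn)
{
	demo_memblk_nr_poison_inc(pfn);
	demo_poisoned_total++;
	assert(demo_poisoned_total > 0);
	return demo_poisoned_total;
}
```

The point of the pattern is that the guard on the declarations matches
the guard on the definitions exactly; if one side checks only one option
and the other checks both, some configurations end up calling a function
that is never compiled.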