On Thu, Aug 22, 2024 at 2:46 AM Hao Ge <hao.ge@xxxxxxxxx> wrote:
>
> Hi Miaohe
>
>
> Thank you for taking the time to review this patch.
>
>
> On 8/22/24 16:04, Miaohe Lin wrote:
> > On 2024/8/22 10:58, Hao Ge wrote:
> >> From: Hao Ge <gehao@xxxxxxxxxx>
> >>
> > Thanks for your patch.
> >
> >> The PG_hwpoison page will be caught and isolated on the entrance to
> >> the free buddy page pool. so,when we clear this flag and return it
> >> to the buddy system,mark codetags for pages as empty.
> >>
> > Is below scene cause the problem?
> >
> > 1. Pages are allocated. pgalloc_tag_add() will be called when prep_new_page().
> >
> > 2. Pages are hwpoisoned. memory_failure() will set PG_hwpoison flag and pgalloc_tag_sub()
> >    will be called when pages are caught and isolated on the entrance to buddy.

Hi Folks,
Thanks for reporting this! Could you please describe in more details
how memory_failure() ends up calling pgalloc_tag_sub()? It's not
obvious to me which path leads to pgalloc_tag_sub(), so I must be
missing something.
On a conceptual level I want to understand if the page isolated in
this manner should be considered freed or not. If it shouldn't be
considered free then I think the right fix would be to avoid
pgalloc_tag_sub() when this isolation happens.
Thanks,
Suren.

> >
> > 3. unpoison_memory cleared flags and sent the pages to buddy. pgalloc_tag_sub() will be
> >    called again in free_pages_prepare().
> >
> > So there is a imbalance that pgalloc_tag_add() is called once and pgalloc_tag_sub() is called twice?
> As you said, that's exactly the case.
> >
> > If so, let's think about more complicated scene:
> >
> > 1. Same as above.
> >
> > 2. Pages are hwpoisoned. But memory_failure() fails to handle it. So PG_hwpoison flag is set
> >    but pgalloc_tag_sub() is not called (pages are not sent to buddy).
> >
> > 3. unpoison_memory cleared flags and calls clear_page_tag_ref() without calling pgalloc_tag_sub()
> >    first. Will this cause problem?
> >
> > Though this should be really rare...
> >
> > Thanks.
> > .
>
> Great, I didn't anticipate this scenario.
>
> When we call clear_page_tag_ref() without calling pgalloc_tag_sub(),
>
> It will cause exceptions in tag->counters->bytes and tag->counters->calls.
>
> We can add a layer of protection to handle it
>
> The pseudocode is as follows:
>
> if (mem_alloc_profiling_enabled()) {
>         union codetag_ref *ref = get_page_tag_ref(page);
>
>         if (ref) {
>                 if( ref->ct != NULL && !is_codetag_empty(ref))
>                 {
>                         tag = ct_to_alloc_tag(ref->ct);
>                         this_cpu_sub(tag->counters->bytes, bytes);
>                         this_cpu_dec(tag->counters->calls);
>                 }
>                 set_codetag_empty(ref);
>                 put_page_tag_ref(ref);
>         }
> }
>
> Hi Suren and Kent
>
> Do you have any suggestions for this? If it's okay, I'll add comments
> and include this pseudocode in clear_page_tag_ref.
>
> >> It was detected by [1] and the following WARN occurred:
> >>
> >> [ 113.930443][ T3282] ------------[ cut here ]------------
> >> [ 113.931105][ T3282] alloc_tag was not set
> >> [ 113.931576][ T3282] WARNING: CPU: 2 PID: 3282 at ./include/linux/alloc_tag.h:130 pgalloc_tag_sub.part.66+0x154/0x164
> >> [ 113.932866][ T3282] Modules linked in: hwpoison_inject fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_man4
> >> [ 113.941638][ T3282] CPU: 2 UID: 0 PID: 3282 Comm: madvise11 Kdump: loaded Tainted: G W 6.11.0-rc4-dirty #18
> >> [ 113.943003][ T3282] Tainted: [W]=WARN
> >> [ 113.943453][ T3282] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
> >> [ 113.944378][ T3282] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> >> [ 113.945319][ T3282] pc : pgalloc_tag_sub.part.66+0x154/0x164
> >> [ 113.946016][ T3282] lr : pgalloc_tag_sub.part.66+0x154/0x164
> >> [ 113.946706][ T3282] sp : ffff800087093a10
> >> [ 113.947197][ T3282] x29: ffff800087093a10 x28: ffff0000d7a9d400 x27: ffff80008249f0a0
> >> [ 113.948165][ T3282] x26: 0000000000000000 x25: ffff80008249f2b0 x24: 0000000000000000
> >> [ 113.949134][ T3282] x23: 0000000000000001 x22: 0000000000000001 x21: 0000000000000000
> >> [ 113.950597][ T3282] x20: ffff0000c08fcad8 x19: ffff80008251e000 x18: ffffffffffffffff
> >> [ 113.952207][ T3282] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800081746210
> >> [ 113.953161][ T3282] x14: 0000000000000000 x13: 205d323832335420 x12: 5b5d353031313339
> >> [ 113.954120][ T3282] x11: ffff800087093500 x10: 000000000000005d x9 : 00000000ffffffd0
> >> [ 113.955078][ T3282] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008236ba90 x6 : c0000000ffff7fff
> >> [ 113.956036][ T3282] x5 : ffff000b34bf4dc8 x4 : ffff8000820aba90 x3 : 0000000000000001
> >> [ 113.956994][ T3282] x2 : ffff800ab320f000 x1 : 841d1e35ac932e00 x0 : 0000000000000000
> >> [ 113.957962][ T3282] Call trace:
> >> [ 113.958350][ T3282] pgalloc_tag_sub.part.66+0x154/0x164
> >> [ 113.959000][ T3282] pgalloc_tag_sub+0x14/0x1c
> >> [ 113.959539][ T3282] free_unref_page+0xf4/0x4b8
> >> [ 113.960096][ T3282] __folio_put+0xd4/0x120
> >> [ 113.960614][ T3282] folio_put+0x24/0x50
> >> [ 113.961103][ T3282] unpoison_memory+0x4f0/0x5b0
> >> [ 113.961678][ T3282] hwpoison_unpoison+0x30/0x48 [hwpoison_inject]
> >> [ 113.962436][ T3282] simple_attr_write_xsigned.isra.34+0xec/0x1cc
> >> [ 113.963183][ T3282] simple_attr_write+0x38/0x48
> >> [ 113.963750][ T3282] debugfs_attr_write+0x54/0x80
> >> [ 113.964330][ T3282] full_proxy_write+0x68/0x98
> >> [ 113.964880][ T3282] vfs_write+0xdc/0x4d0
> >> [ 113.965372][ T3282] ksys_write+0x78/0x100
> >> [ 113.965875][ T3282] __arm64_sys_write+0x24/0x30
> >> [ 113.966440][ T3282] invoke_syscall+0x7c/0x104
> >> [ 113.966984][ T3282] el0_svc_common.constprop.1+0x88/0x104
> >> [ 113.967652][ T3282] do_el0_svc+0x2c/0x38
> >> [ 113.968893][ T3282] el0_svc+0x3c/0x1b8
> >> [ 113.969379][ T3282] el0t_64_sync_handler+0x98/0xbc
> >> [ 113.969980][ T3282] el0t_64_sync+0x19c/0x1a0
> >> [ 113.970511][ T3282] ---[ end trace 0000000000000000 ]---
> >>
> >> Link [1]: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise11.c
> >>
> >> Fixes: a8fc28dad6d5 ("alloc_tag: introduce clear_page_tag_ref() helper function")
> >> Cc: stable@xxxxxxxxxxxxxxx # v6.10
> >> Signed-off-by: Hao Ge <gehao@xxxxxxxxxx>
> >> ---
> >>   mm/memory-failure.c | 6 ++++++
> >>   1 file changed, 6 insertions(+)
> >>
> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> >> index 7066fc84f351..570388c41532 100644
> >> --- a/mm/memory-failure.c
> >> +++ b/mm/memory-failure.c
> >> @@ -2623,6 +2623,12 @@ int unpoison_memory(unsigned long pfn)
> >>
> >>  		folio_put(folio);
> >>  		if (TestClearPageHWPoison(p)) {
> >> +			/* the PG_hwpoison page will be caught and isolated
> >> +			 * on the entrance to the free buddy page pool.
> >> +			 * so,when we clear this flag and return it to the buddy system,
> >> +			 * clear it's codetag
> >> +			 */
> >> +			clear_page_tag_ref(p);
> >>  			folio_put(folio);
> >>  			ret = 0;
> >>  		}
> >>
> >>
> Thanks
>
> BR
>
> Hao
>
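
For reference, the accounting question in this thread (one pgalloc_tag_add() matched by zero, one, or two pgalloc_tag_sub() calls) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct model_tag, struct model_ref, model_add() and model_clear_guarded() are hypothetical stand-ins, the counters are plain longs rather than per-CPU counters, and a NULL pointer stands in for an empty codetag reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the alloc_tag counters. */
struct model_tag {
	long bytes;
	long calls;
};

/* Hypothetical stand-in for a page's codetag reference; NULL means empty. */
struct model_ref {
	struct model_tag *tag;
};

/* Models the allocation side: charge the tag and remember it in the ref. */
static void model_add(struct model_ref *ref, struct model_tag *tag, long bytes)
{
	assert(ref != NULL && tag != NULL);
	tag->bytes += bytes;
	tag->calls++;
	ref->tag = tag;
}

/*
 * Models the guarded clear from the quoted pseudocode: uncharge only if the
 * reference still points at a tag, then always mark the reference empty.
 * Returns 1 if the counters were adjusted, 0 if the ref was already empty.
 */
static int model_clear_guarded(struct model_ref *ref, long bytes)
{
	int adjusted = 0;

	assert(ref != NULL);
	if (ref->tag != NULL) {
		ref->tag->bytes -= bytes;
		ref->tag->calls--;
		adjusted = 1;
	}
	ref->tag = NULL;
	return adjusted;
}
```

In this model a second clear on the same reference is a no-op, so the counters stay balanced whether or not an earlier subtraction already happened. Whether the real fix belongs in clear_page_tag_ref() or in avoiding pgalloc_tag_sub() during isolation is the open question above.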