On Mon, Jul 10, 2023 at 8:16 AM Jiaqi Yan <jiaqiyan@xxxxxxxxxx> wrote:
>
> On Fri, Jul 7, 2023 at 7:57 PM Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote:
> >
> > On 2023/7/8 4:19, Jiaqi Yan wrote:
> > > Add the functionality, is_raw_hwp_subpage, to tell if a subpage of a
> > > hugetlb folio is a raw HWPOISON page. This functionality relies on
> > > RawHwpUnreliable to be not set; otherwise hugepage's raw HWPOISON list
> > > becomes meaningless.
> > >
> > > is_raw_hwp_subpage needs to hold hugetlb_lock in order to synchronize
> > > with __get_huge_page_for_hwpoison, which iterates and inserts an entry
> > > to raw_hwp_list. llist itself doesn't ensure insertion is synchronized
> > > with the iterating used by __is_raw_hwp_subpage. Caller can minimize
> > > the overhead of lock cycles by first checking if folio / head page's
> > > HWPOISON flag is set.
> > >
> > > Export this functionality to be immediately used in the read operation
> > > for hugetlbfs.
> > >
> > > Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> > > Reviewed-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> > > Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
> > > ---
> > >  include/linux/hugetlb.h | 19 +++++++++++++++++++
> > >  include/linux/mm.h      |  7 +++++++
> > >  mm/hugetlb.c            | 10 ++++++++++
> > >  mm/memory-failure.c     | 34 ++++++++++++++++++++++++----------
> > >  4 files changed, 60 insertions(+), 10 deletions(-)
> > > ...
> > > -static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
> > > +bool __is_raw_hwp_subpage(struct folio *folio, struct page *subpage)
> > >  {
> > > -        return (struct llist_head *)&folio->_hugetlb_hwpoison;
> > > +        struct llist_head *raw_hwp_head;
> > > +        struct raw_hwp_page *p, *tmp;
> > > +        bool ret = false;
> > > +
> > > +        if (!folio_test_hwpoison(folio))
> > > +                return false;
> > > +
> > > +        /*
> > > +         * When RawHwpUnreliable is set, kernel lost track of which subpages
> > > +         * are HWPOISON. So return as if ALL subpages are HWPOISONed.
> > > +         */
> > > +        if (folio_test_hugetlb_raw_hwp_unreliable(folio))
> > > +                return true;
> > > +
> > > +        raw_hwp_head = raw_hwp_list_head(folio);
> > > +        llist_for_each_entry_safe(p, tmp, raw_hwp_head->first, node) {
> >
> > Since we don't free the raw_hwp_list, does llist_for_each_entry work the
> > same as llist_for_each_entry_safe?

Sorry I missed this comment. Yes, they work the same here, but
llist_for_each_entry doesn't need "tmp". I will switch to
llist_for_each_entry in v4.

> > > +                if (subpage == p->page) {
> > > +                        ret = true;
> > > +                        break;
> > > +                }
> > > +        }
> > > +
> > > +        return ret;
> > >  }
> >
> > It seems there's a race between __is_raw_hwp_subpage and unpoison_memory:
> >
> > unpoison_memory                 __is_raw_hwp_subpage
> >                                   if (!folio_test_hwpoison(folio)) -- hwpoison is set
> >   folio_free_raw_hwp              llist_for_each_entry_safe raw_hwp_list
> >     llist_del_all                   ..
> >   folio_test_clear_hwpoison
> >
>
> Thanks Miaohe for raising this concern.
>
> > But __is_raw_hwp_subpage is used in hugetlbfs, unpoison_memory couldn't
> > reach here because there's a folio_mapping == NULL check before
> > folio_free_raw_hwp.
>
> I agree. But in the near future I do want to make __is_raw_hwp_subpage
> work for shared-mapping hugetlb, so it would be nice if it also worked
> with unpoison_memory. It doesn't seem to me that holding mf_mutex in
> __is_raw_hwp_subpage is nice or even absolutely correct. Let me think
> if I can come up with something in v4.
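Coming back to the llist_for_each_entry question: the v4 loop would look
roughly like this (just a sketch; the rest of __is_raw_hwp_subpage stays
as above, only the iterator changes and "tmp" goes away):

        raw_hwp_head = raw_hwp_list_head(folio);
        /*
         * Entries are only inserted, never freed, while we iterate under
         * hugetlb_lock, so the non-_safe iterator is enough.
         */
        llist_for_each_entry(p, raw_hwp_head->first, node) {
                if (subpage == p->page) {
                        ret = true;
                        break;
                }
        }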
On second thought, if __is_raw_hwp_subpage simply takes mf_mutex before
llist_for_each_entry, it will introduce a deadlock:

  unpoison_memory                 __is_raw_hwp_subpage
    held mf_mutex                   held hugetlb_lock
    get_hwpoison_hugetlb_folio      attempts mf_mutex
      attempts hugetlb_lock

Not for this patch series, but for the future: is it a good idea to make
mf_mutex available to hugetlb code, and then enforce the locking order to
be mf_mutex first, hugetlb_lock second? I believe this is the current
locking pattern / order in try_memory_failure_hugetlb. (A rough sketch of
what I mean is at the end of this mail.)

> >
> > Anyway, this patch looks good to me.
> >
> > Reviewed-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
>
> Thanks.
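P.S. Here is the rough sketch mentioned above, just to make the ordering
idea concrete; it is NOT part of this series. It assumes a hypothetical
mf_mutex_lock()/mf_mutex_unlock() helper pair exported from
mm/memory-failure.c so that hugetlb code can take mf_mutex, and it keeps
the lock order mf_mutex first, hugetlb_lock second, same as
try_memory_failure_hugetlb:

/* Sketch only: mf_mutex_lock()/mf_mutex_unlock() do not exist today. */
bool is_raw_hwp_subpage(struct folio *folio, struct page *subpage)
{
        bool ret;

        /* Cheap check first so the common case takes no locks at all. */
        if (!folio_test_hwpoison(folio))
                return false;

        mf_mutex_lock();                /* hypothetical helper */
        spin_lock_irq(&hugetlb_lock);
        ret = __is_raw_hwp_subpage(folio, subpage);
        spin_unlock_irq(&hugetlb_lock);
        mf_mutex_unlock();              /* hypothetical helper */

        return ret;
}

With that ordering, unpoison_memory (mf_mutex, then hugetlb_lock via
get_hwpoison_hugetlb_folio) and this path can no longer deadlock against
each other.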