On 07/11/23 10:05, Jiaqi Yan wrote: > On Mon, Jul 10, 2023 at 8:16 AM Jiaqi Yan <jiaqiyan@xxxxxxxxxx> wrote: > > On Fri, Jul 7, 2023 at 7:57 PM Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote: > > > On 2023/7/8 4:19, Jiaqi Yan wrote: > > > > > > > + if (subpage == p->page) { > > > > + ret = true; > > > > + break; > > > > + } > > > > + } > > > > + > > > > + return ret; > > > > } > > > > > > It seems there's a race between __is_raw_hwp_subpage and unpoison_memory: > > > unpoison_memory __is_raw_hwp_subpage > > > if (!folio_test_hwpoison(folio)) -- hwpoison is set > > > folio_free_raw_hwp llist_for_each_entry_safe raw_hwp_list > > > llist_del_all .. > > > folio_test_clear_hwpoison > > > > > > > Thanks Miaohe for raising this concern. > > > > > But __is_raw_hwp_subpage is used in hugetlbfs, unpoison_memory couldn't reach here because there's a > > > folio_mapping == NULL check before folio_free_raw_hwp. > > > > I agree. But in near future I do want to make __is_raw_hwp_subpage > > work for shared-mapping hugetlb, so it would be nice to work with > > unpoison_memory. It doesn't seem to me that holding mf_mutex in > > __is_raw_hwp_subpage is nice or even absolutely correct. Let me think > > if I can come up with something in v4. > > At my 2nd thought, if __is_raw_hwp_subpage simply takes mf_mutex > before llist_for_each_entry, it will introduce a deadlock: > > unpoison_memory __is_raw_hwp_subpage > held mf_mutex held hugetlb_lock > get_hwpoison_hugetlb_folio attempts mf_mutex > attempts hugetlb lock > > Not for this patch series, but for future, is it a good idea to make > mf_mutex available to hugetlb code? Then enforce the order of locking > to be mf_mutex first, hugetlb_lock second? I believe this is the > current locking pattern / order for try_memory_failure_hugetlb. I think only holding mf_mutex in __is_raw_hwp_subpage would be sufficient to prevent races with unpoison_memory. memory failure code needs to take both mf_mutex and hugetlb_lock. The hugetlb lock is to prevent hugetlb page state changes. IIUC, __is_raw_hwp_subpage is only taking hugetlb_lock to prevent races with memory failure code. Of course, I could be missing something as there are subtle issues with locking in the memory failure code. -- Mike Kravetz