Hi Kirill, On 4/25/2023 8:38 PM, Kirill A. Shutemov wrote: > On Tue, Apr 25, 2023 at 04:46:26PM +0800, Yin Fengwei wrote: >> free_transhuge_page() acquires split queue lock then check >> whether the THP was added to deferred list or not. >> >> It's safe to check whether the THP is in deferred list or not. >> When code hit free_transhuge_page(), there is no one tries >> to update the folio's _deferred_list. >> >> If folio is not in deferred_list, it's safe to check without >> acquiring lock. >> >> If folio is in deferred_list, the other node in deferred_list >> adding/deleteing doesn't impact the return value of >> list_epmty(@folio->_deferred_list). > > Typo. > >> >> Running page_fault1 of will-it-scale + order 2 folio for anonymous >> mapping with 96 processes on an Ice Lake 48C/96T test box, we could >> see the 61% split_queue_lock contention: >> - 71.28% 0.35% page_fault1_pro [kernel.kallsyms] [k] >> release_pages >> - 70.93% release_pages >> - 61.42% free_transhuge_page >> + 60.77% _raw_spin_lock_irqsave >> >> With this patch applied, the split_queue_lock contention is less >> than 1%. >> >> Signed-off-by: Yin Fengwei <fengwei.yin@xxxxxxxxx> >> Tested-by: Ryan Roberts <ryan.roberts@xxxxxxx> >> --- >> mm/huge_memory.c | 19 ++++++++++++++++--- >> 1 file changed, 16 insertions(+), 3 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 032fb0ef9cd1..c620f1f12247 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -2799,12 +2799,25 @@ void free_transhuge_page(struct page *page) >> struct deferred_split *ds_queue = get_deferred_split_queue(folio); >> unsigned long flags; >> >> - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); >> - if (!list_empty(&folio->_deferred_list)) { >> + /* >> + * At this point, there is no one trying to queue the folio >> + * to deferred_list. folio->_deferred_list is not possible >> + * being updated. >> + * >> + * If folio is already added to deferred_list, add/delete to/from >> + * deferred_list will not impact list_empty(&folio->_deferred_list). >> + * It's safe to check list_empty(&folio->_deferred_list) without >> + * acquiring the lock. >> + * >> + * If folio is not in deferred_list, it's safe to check without >> + * acquiring the lock. >> + */ >> + if (data_race(!list_empty(&folio->_deferred_list))) { >> + spin_lock_irqsave(&ds_queue->split_queue_lock, flags); > > Recheck under lock? In function deferred_split_scan(), there is following code block: if (folio_try_get(folio)) { list_move(&folio->_deferred_list, &list); } else { /* We lost race with folio_put() */ list_del_init(&folio->_deferred_list); ds_queue->split_queue_len--; } I am wondering what kind of "lost race with folio_put()" can be. My understanding is that it's not necessary to handle this case here because free_transhuge_page() will handle it once folio get zero ref. But I must miss something here. Thanks. Regards Yin, Fengwei > >> ds_queue->split_queue_len--; >> list_del(&folio->_deferred_list); >> + spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); >> } >> - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); >> free_compound_page(page); >> } >> >> -- >> 2.30.2 >> >> >