On 4/2/21 10:59 PM, Muchun Song wrote:
> On Sat, Apr 3, 2021 at 4:56 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>
>> On 4/2/21 5:47 AM, Muchun Song wrote:
>>> On Wed, Mar 31, 2021 at 11:42 AM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>>>>
>>>> Commit c77c0a8ac4c5 ("mm/hugetlb: defer freeing of huge pages if in
>>>> non-task context") was added to address the issue of free_huge_page
>>>> being called from irq context.  That commit hands off free_huge_page
>>>> processing to a workqueue if !in_task.  However, this doesn't cover
>>>> all the cases as pointed out by 0day bot lockdep report [1].
>>>>
>>>> :       Possible interrupt unsafe locking scenario:
>>>> :
>>>> :             CPU0                    CPU1
>>>> :             ----                    ----
>>>> :        lock(hugetlb_lock);
>>>> :                                     local_irq_disable();
>>>> :                                     lock(slock-AF_INET);
>>>> :                                     lock(hugetlb_lock);
>>>> :        <Interrupt>
>>>> :          lock(slock-AF_INET);
>>>>
>>>> Shakeel has later explained that this is very likely TCP TX zerocopy
>>>> from hugetlb pages scenario when the networking code drops a last
>>>> reference to hugetlb page while having IRQ disabled.  Hugetlb freeing
>>>> path doesn't disable IRQ while holding hugetlb_lock so a lock dependency
>>>> chain can lead to a deadlock.
>>>>
>>>> This commit addresses the issue by doing the following:
>>>> - Make hugetlb_lock irq safe.  This is mostly a simple process of
>>>>   changing spin_*lock calls to spin_*lock_irq* calls.
>>>> - Make subpool lock irq safe in a similar manner.
>>>> - Revert the !in_task check and workqueue handoff.
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/000000000000f1c03b05bc43aadc@xxxxxxxxxx/
>>>>
>>>> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>>>> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>>>> Reviewed-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
>>>
>>> Hi Mike,
>>>
>>> Today I pulled the newest code (next-20210401).  I found that
>>> alloc_and_dissolve_huge_page is not updated.  In this function,
>>> hugetlb_lock is still non-irq safe.  Maybe you or Oscar need
>>> to fix.
>>>
>>> Thanks.
>>
>> Thank you Muchun,
>>
>> Oscar's changes were not in Andrew's tree when I started on this series
>> and I failed to notice their inclusion.  In addition,
>> isolate_or_dissolve_huge_page also needs updating, as well as a change in
>> set_max_huge_pages that was omitted while rebasing.
>>
>> Andrew, the following patch addresses those missing changes.  Ideally,
>> the changes should be combined/included in this patch.  If you want me
>> to send another version of this patch or another series, let me know.
>>
>> From 450593eb3cea895f499ddc343c22424c552ea502 Mon Sep 17 00:00:00 2001
>> From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>> Date: Fri, 2 Apr 2021 13:18:13 -0700
>> Subject: [PATCH] hugetlb: fix irq locking omissions
>>
>> The patch "hugetlb: make free_huge_page irq safe" changed spin_*lock
>> calls to spin_*lock_irq* calls.  However, it missed several places
>> in the file hugetlb.c.  Add the overlooked changes.
>>
>> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>
> Thanks Mike.  But there are still some places that need
> improvement.  See below.
>

Correct.  My apologies again for not fully taking into account the new
code from Oscar's series when working on this.

>> ---
>>  mm/hugetlb.c | 16 ++++++++--------
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index c22111f3da20..a6bfc6bcbc81 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2284,7 +2284,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>  	 */
>>  	page_ref_dec(new_page);
>>  retry:
>> -	spin_lock(&hugetlb_lock);
>> +	spin_lock_irq(&hugetlb_lock);
>>  	if (!PageHuge(old_page)) {
>>  		/*
>>  		 * Freed from under us. Drop new_page too.
>> @@ -2297,7 +2297,7 @@ static int alloc_and_dissolve_huge_page(struct hstate *h, struct page *old_page,
>>  	 * Fail with -EBUSY if not possible.
>>  	 */
>>  	update_and_free_page(h, new_page);
>
> Now update_and_free_page can be called without holding
> hugetlb_lock.
> We can move it out of hugetlb_lock.  In this
> function, there are 3 places which call update_and_free_page().
> We can move all of them out of hugetlb_lock.  Right?

We will need to do more than that.  The call to update_and_free_page
in alloc_and_dissolve_huge_page assumes the old functionality, i.e., it
assumes h->nr_huge_pages will be decremented in update_and_free_page.
This is no longer the case.  This will need to be fixed in patch 4 of
my series, which changes the functionality of update_and_free_page.

I'm afraid a change there will end up requiring changes in subsequent
patches due to context.  I will have an update on Monday.
--
Mike Kravetz