On 2019/3/1 15:29, Naoya Horiguchi wrote:
> On Tue, Feb 26, 2019 at 10:34:32PM +0800, zhong jiang wrote:
>> On 2019/2/26 21:51, Kirill A. Shutemov wrote:
>>> On Tue, Feb 26, 2019 at 07:18:00PM +0800, zhong jiang wrote:
>>>> From: zhongjiang <zhongjiang@xxxxxxxxxx>
>>>>
>>>> When soft_offline_in_use_page() runs on a thp tail page after pmd is plit,
>>> s/plit/split/
>>>
>>>> we trigger the following VM_BUG_ON_PAGE():
>>>>
>>>> Memory failure: 0x3755ff: non anonymous thp
>>>> __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000
>>>> Soft offlining pfn 0x34d805 at process virtual address 0x20fff000
>>>> page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1
>>>> flags: 0x2fffff80000000()
>>>> raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000
>>>> raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>>>> page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
>>>> ------------[ cut here ]------------
>>>> kernel BUG at ./include/linux/mm.h:519!
>>>>
>>>> soft_offline_in_use_page() passed refcount and page lock from tail page to
>>>> head page, which is not needed because we can pass any subpage to
>>>> split_huge_page().
>>> I don't see a description of what is going wrong and why change will fixed
>>> it. From the description, it appears as it's cosmetic-only change.
>>>
>>> Please elaborate.
>> When soft_offline_in_use_page runs on a thp tail page after pmd is split,
>> and we pass the head page to split_huge_page, Unfortunately, the tail page
>> can be free or count turn into zero.
> I guess that you have the similar fix on memory_failure() in your mind:
>
> commit c3901e722b2975666f42748340df798114742d6d
> Author: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Date: Thu Nov 10 10:46:23 2016 -0800
>
> mm: hwpoison: fix thp split handling in memory_failure()
>
> So it seems that I somehow missed fixing soft offline when I wrote commit
> c3901e722b29, and now you find and fix that. Thank you very much.
> If you resend the patch with fixing typo, can you add some reference to
> c3901e722b29 in the patch description to show the linkage?
> And you can add the following tags:
Yep, I found that it is a similar issue, hence I referred to the description in the patch you mentioned.

I will add the description you mentioned above in V2.

Thanks,
zhong jiang
> Fixes: 61f5d698cc97 ("mm: re-enable THP")
> Acked-by: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
>
> Thanks,
> Naoya Horiguchi