On 2017/5/23 17:33, Vlastimil Babka wrote:
> On 05/23/2017 11:21 AM, zhong jiang wrote:
>> On 2017/5/23 0:51, Vlastimil Babka wrote:
>>> On 05/20/2017 05:01 AM, zhong jiang wrote:
>>>> On 2017/5/20 10:40, Hugh Dickins wrote:
>>>>> On Sat, 20 May 2017, Xishi Qiu wrote:
>>>>>> Here is a bug report from Red Hat: https://bugzilla.redhat.com/show_bug.cgi?id=1305620
>>>>>> And I hit the bug too. However, it is hard to reproduce, and
>>>>>> 624483f3ea82598("mm: rmap: fix use-after-free in __put_anon_vma") does not help.
>>>>>>
>>>>>> From the vmcore, it seems that the page is still mapped (_mapcount=0 and _count=2),
>>>>>> and the value of mapping is a valid address (mapping = 0xffff8801b3e2a101),
>>>>>> but anon_vma has been corrupted.
>>>>>>
>>>>>> Any ideas?
>>>>> Sorry, no. I assume that _mapcount has been misaccounted, for example
>>>>> a pte mapped in on top of another pte; but cannot begin to tell you where
>>>>> in Red Hat's kernel-3.10.0-229.4.2.el7 that might happen.
>>>>>
>>>>> Hugh
>>>>>
>>>> Hi, Hugh
>>>>
>>>> I found the following message in the dmesg.
>>>>
>>>> [26068.316592] BUG: Bad rss-counter state mm:ffff8800a7de2d80 idx:1 val:1
>>>>
>>>> I can prove that _mapcount is misaccounted: when the task has exited, the rmap
>>>> still exists.
>>> Check if the kernel in question contains this commit: ad33bb04b2a6 ("mm:
>>> thp: fix SMP race condition between THP page fault and MADV_DONTNEED")
>> Hi, Vlastimil
>>
>> I am missing the patch.
>
> Try applying it then, there's a good chance the error and crash will go
> away. Even if your workload doesn't actually run any madvise(MADV_DONTNEED).
>

Hi Vlastimil,

I see this error was reported by Kirill at the following link, right?
https://patchwork.kernel.org/patch/7550401/

The call trace looks much the same as ours.

Thanks,
Xishi Qiu

>> When I read the patch, I found the following issue, but I am sure it is right.
>>
>>         if (unlikely(pmd_trans_unstable(pmd)))
>>                 return 0;
>>         /*
>>          * A regular pmd is established and it can't morph into a huge pmd
>>          * from under us anymore at this point because we hold the mmap_sem
>>          * read mode and khugepaged takes it in write mode. So now it's
>>          * safe to run pte_offset_map().
>>          */
>>         pte = pte_offset_map(pmd, address);
>>
>> After the pmd_trans_unstable() call there is no further protection. According
>> to the comment, pte_offset_map() is considered safe. But before the
>> pte_offset_map() call, the pmd may still be unstable. Is that possible?
>
> IIRC it's "unstable" wrt possible none->huge->none transition. But once
> we've seen it's a regular pmd via pmd_trans_unstable(), we're safe as a
> transition from regular pmd can't happen.
>
>> Thanks
>> zhongjiang
>>>> Thanks
>>>> zhongjiang
>>>>
>>>> --
>>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>>> the body to majordomo@xxxxxxxxx. For more info on Linux MM,
>>>> see: http://www.linux-mm.org/ .
>>>> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>
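
Vlastimil's answer about pmd_trans_unstable() can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not
kernel code: the enum and every model_* name below are hypothetical, and the
model assumes mmap_sem is held in read mode, so that only a none or
transparent-huge pmd can still change underneath the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum pmd_state {
    PMD_STATE_NONE,        /* no page table or huge page installed */
    PMD_STATE_TRANS_HUGE,  /* pmd maps a transparent huge page */
    PMD_STATE_REGULAR      /* pmd points to a regular page table */
};

/*
 * Mirrors the return convention of pmd_trans_unstable(): nonzero means
 * the observed pmd is none or huge and the caller should back off.
 */
int model_pmd_trans_unstable(enum pmd_state snapshot)
{
    return snapshot != PMD_STATE_REGULAR;
}

/*
 * Assumption: with mmap_sem held for reading, a none pmd can be
 * populated by a fault and a huge pmd can be zapped by MADV_DONTNEED,
 * but a regular pmd cannot change, because the paths that would change
 * it (such as khugepaged collapse) take mmap_sem for writing.
 */
bool model_may_change_under_read_lock(enum pmd_state state)
{
    return state == PMD_STATE_NONE || state == PMD_STATE_TRANS_HUGE;
}

/* A pte lookup is safe only if the check passed on a stable state. */
bool model_safe_to_map_pte(enum pmd_state snapshot)
{
    if (model_pmd_trans_unstable(snapshot))
        return false;
    return !model_may_change_under_read_lock(snapshot);
}

/* Every state that passes the check must also be a stable state. */
void model_check_invariant(void)
{
    for (int s = PMD_STATE_NONE; s <= PMD_STATE_REGULAR; s++) {
        enum pmd_state state = (enum pmd_state)s;

        if (!model_pmd_trans_unstable(state))
            assert(!model_may_change_under_read_lock(state));
    }
}
```

In this model the window zhong jiang asks about does not exist: the only
state that passes the check is the regular one, and that state cannot change
until the read lock is dropped.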