On 06/08/2017 03:44 PM, Xishi Qiu wrote:
> On 2017/5/23 17:33, Vlastimil Babka wrote:
>
>> On 05/23/2017 11:21 AM, zhong jiang wrote:
>>> On 2017/5/23 0:51, Vlastimil Babka wrote:
>>>> On 05/20/2017 05:01 AM, zhong jiang wrote:
>>>>> On 2017/5/20 10:40, Hugh Dickins wrote:
>>>>>> On Sat, 20 May 2017, Xishi Qiu wrote:
>>>>>>> Here is a bug report from Red Hat: https://bugzilla.redhat.com/show_bug.cgi?id=1305620
>>>>>>> And I hit the bug too. However, it is hard to reproduce, and
>>>>>>> 624483f3ea82598 ("mm: rmap: fix use-after-free in __put_anon_vma") does not help.
>>>>>>>
>>>>>>> From the vmcore, it seems that the page is still mapped (_mapcount=0 and _count=2),
>>>>>>> and the value of mapping is a valid address (mapping = 0xffff8801b3e2a101),
>>>>>>> but anon_vma has been corrupted.
>>>>>>>
>>>>>>> Any ideas?
>>>>>> Sorry, no. I assume that _mapcount has been misaccounted, for example
>>>>>> a pte mapped in on top of another pte; but cannot begin to tell you where
>>>>>> in Red Hat's kernel-3.10.0-229.4.2.el7 that might happen.
>>>>>>
>>>>>> Hugh
>>>>>>
>>>>> Hi, Hugh
>>>>>
>>>>> I found the following message in dmesg.
>>>>>
>>>>> [26068.316592] BUG: Bad rss-counter state mm:ffff8800a7de2d80 idx:1 val:1
>>>>>
>>>>> I can prove that _mapcount is misaccounted: after the task has exited,
>>>>> the rmap still exists.
>>>> Check if the kernel in question contains this commit: ad33bb04b2a6 ("mm:
>>>> thp: fix SMP race condition between THP page fault and MADV_DONTNEED")
>>> Hi, Vlastimil
>>>
>>> I am missing the patch.
>>
>> Try applying it then, there's a good chance the error and crash will go
>> away. Even if your workload doesn't actually run any madvise(MADV_DONTNEED).
>>
>
> Hi Vlastimil,
>
> I found that this error was reported by Kirill as follows, right?
> https://patchwork.kernel.org/patch/7550401/

That was reported by Minchan.

> The call trace is much the same as ours.

In that thread, the error seems to have just disappeared in the end.

So, did you apply the patch I suggested? Did it help?

> Thanks,
> Xishi Qiu
>
>>> When I read the patch, I found the following issue, but I am sure it is right.
>>>
>>>     if (unlikely(pmd_trans_unstable(pmd)))
>>>         return 0;
>>>     /*
>>>      * A regular pmd is established and it can't morph into a huge pmd
>>>      * from under us anymore at this point because we hold the mmap_sem
>>>      * read mode and khugepaged takes it in write mode. So now it's
>>>      * safe to run pte_offset_map().
>>>      */
>>>     pte = pte_offset_map(pmd, address);
>>>
>>> After the pmd_trans_unstable() call there is no protection. According to
>>> the comment, pte_offset_map() is safe at that point, but before
>>> pte_offset_map() is called the pmd may still be unstable. Is that possible?
>>
>> IIRC it's "unstable" wrt possible none->huge->none transition. But once
>> we've seen it's a regular pmd via pmd_trans_unstable(), we're safe as a
>> transition from regular pmd can't happen.
>>
>>> Thanks
>>> zhongjiang
>>>>> Thanks
>>>>> zhongjiang
>>>>>
>>>>> --
>>>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>>>> the body to majordomo@xxxxxxxxx. For more info on Linux MM,
>>>>> see: http://www.linux-mm.org/ .
>>>>> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>