Hi Peter, On Thu, Nov 17, 2022 at 12:25 AM Peter Xu <peterx@xxxxxxxxxx> wrote: > > On Wed, Nov 16, 2022 at 01:45:15PM +0300, Anatoly Pugachev wrote: > > On Wed, Nov 16, 2022 at 11:49 AM hev <r@xxxxxx> wrote: > > > > > > Hello Peter, > > Hi, Hev, > > Thanks for letting me know. > > > > > > > I see a random crash issue on the LoongArch system, that is caused by > > > commit 0ccf7f1 ("mm/thp: carry over dirty bit when thp splits on > > > pmd"). > > > > > > Now, the thing is already resolved. The root cause is arch's mkdirty > > > is set hardware writable bit in unconditional. That breaks > > > write-protect and then breaks COW. > > Could you help explain how that happened? > > I'm taking example of loongarch here: > > static inline pte_t pte_mkdirty(pte_t pte) > { > pte_val(pte) |= (_PAGE_DIRTY | _PAGE_MODIFIED); > return pte; > } > > #define _PAGE_MODIFIED (_ULCAST_(1) << _PAGE_MODIFIED_SHIFT) > #define _PAGE_MODIFIED_SHIFT 9 _PAGE_MODIFIED is a software dirty bit > #define _PAGE_DIRTY (_ULCAST_(1) << _PAGE_DIRTY_SHIFT) > #define _PAGE_DIRTY_SHIFT 1 _PAGE_DIRTY is a hardware writable bit (bad naming), meaning that mmu allows write memory without any exception raised. > > I don't see when write bit is set, which is bit 8 instead: > > #define _PAGE_WRITE (_ULCAST_(1) << _PAGE_WRITE_SHIFT) > #define _PAGE_WRITE_SHIFT 8 _PAGE_WRITE is a software writable bit (not hardware). As David said, In __split_huge_pmd_locked, the VMA does not include VM_WRITE, entry = maybe_mkwrite(entry, vma); so the pte does not include software writable bit (_PAGE_WRITE). and the dirty is true, if (dirty) entry = pte_mkdirty(entry); so the incorrect arch's pte_mkdirty set hardware writable bit(_PAGE_DIRTY) in unconditional for read-only pages. Regards, Ray > > According to loongarch spec: > > https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#section-multi-level-page-table-structure-supported-by-page-walking > > Bits 1 & 8 match the spec D & W definitions. Bit 9 seems not defined but I > didn't quickly spot how that's related to the write bit. > > > > > > > Here is a simple and fast testcase (It may be helpful for sparc64): > > > https://gist.github.com/heiher/72919fae6b53f04cac606a9631100506 > > > (assertion: c sum == 0) > > > > Just tried on my sparc64 VM - fixed vs old (non-patched) kernels... > > > > fixed kernel (6.1.0-rc5) running ./a.out: > > mator@ttip:~$ ./a.out > > c sum: 0 > > p sum: 35184372088832 > > c sum: 0 > > p sum: 35184372088832 > > c sum: 0 > > p sum: 35184372088832 > > c sum: 0 > > p sum: 35184372088832 > > c sum: 0 > > p sum: 35184372088832 > > ... > > > > old (non-patched) kernel (6.1.0-rc4) : > > mator@ttip:~$ ./a.out > > c sum: 35150012350464 > > p sum: 35184372088832 > > c sum: 35150012350464 > > p sum: 35184372088832 > > ... > > Thanks for the quick run, Anatoly. Obviously I went the wrong way before > on the code patching. It seems we have more chance fixing this. > > -- > Peter Xu >