On Fri, Nov 25, 2022 at 07:38:36PM +0800, hev wrote:
> On Tue, Nov 22, 2022 at 2:55 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > Hi, Anatoly (or/and Hev),
> >
> > On Wed, Nov 16, 2022 at 01:45:15PM +0300, Anatoly Pugachev wrote:
> > > On Wed, Nov 16, 2022 at 11:49 AM hev <r@xxxxxx> wrote:
> > > >
> > > > Hello Peter,
> > > >
> > > > I see a random crash issue on the LoongArch system that is caused by
> > > > commit 0ccf7f1 ("mm/thp: carry over dirty bit when thp splits on
> > > > pmd").
> > > >
> > > > Now, the thing is already resolved. The root cause is that the arch's
> > > > mkdirty sets the hardware writable bit unconditionally. That breaks
> > > > write-protect and then breaks COW.
> > > >
> > > > Here is a simple and fast testcase (It may be helpful for sparc64):
> > > > https://gist.github.com/heiher/72919fae6b53f04cac606a9631100506
> > > > (assertion: c sum == 0)
> > >
> > > Just tried on my sparc64 VM - fixed vs old (non-patched) kernels...
> > >
> > > fixed kernel (6.1.0-rc5) running ./a.out:
> > > mator@ttip:~$ ./a.out
> > > c sum: 0
> > > p sum: 35184372088832
> > > c sum: 0
> > > p sum: 35184372088832
> > > c sum: 0
> > > p sum: 35184372088832
> > > c sum: 0
> > > p sum: 35184372088832
> > > c sum: 0
> > > p sum: 35184372088832
> > > ...
> > >
> > > old (non-patched) kernel (6.1.0-rc4):
> > > mator@ttip:~$ ./a.out
> > > c sum: 35150012350464
> > > p sum: 35184372088832
> > > c sum: 35150012350464
> > > p sum: 35184372088832
> > > ...
> >
> > I've got another patch attached that might be nicer to fix this same
> > problem for both archs but without dropping the dirty bit, could you help
> > check whether it works?
>
> The testcase PASSED with this patch and without:
>  * "Partly revert "mm/thp: carry over dirty bit when thp splits on pmd""
>  * "LoongArch: Set _PAGE_DIRTY only if _PAGE_WRITE is set in
>    {pmd,pte}_mkdirty()"

My fault for not noticing that the partial revert patch already landed in
6.1-rc5, so this will need to be another patch on top of it.  I'll post a
formal patch.

Thanks Hev.

-- 
Peter Xu
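
For readers following the thread, the root cause Hev describes can be modelled in a few lines of user-space C. This is only a rough sketch, not kernel code: the bit positions, the `model_` names, and the separate software "modified" flag are assumptions made for illustration. Only the rule itself (set the hardware dirty/writable bit only when the write bit is already set) comes from the patch title quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions for this model; NOT the real LoongArch layout. */
#define PAGE_WRITE    (1u << 0) /* software: mapping may legitimately be written */
#define PAGE_DIRTY    (1u << 1) /* hardware: stores proceed without a fault */
#define PAGE_MODIFIED (1u << 2) /* assumed software-only "page was modified" flag */

typedef uint32_t model_pte_t;   /* hypothetical stand-in for a page table entry */

/* Buggy variant: sets the hardware bit unconditionally, as described in the thread. */
static model_pte_t model_mkdirty_buggy(model_pte_t pte)
{
    return pte | PAGE_MODIFIED | PAGE_DIRTY;
}

/* Fixed variant: sets the hardware bit only if the entry is already writable. */
static model_pte_t model_mkdirty_fixed(model_pte_t pte)
{
    pte |= PAGE_MODIFIED;
    if (pte & PAGE_WRITE)
        pte |= PAGE_DIRTY;
    return pte;
}

/* A store faults (giving COW a chance to copy) only when the hardware bit is clear. */
static int model_store_faults(model_pte_t pte)
{
    return (pte & PAGE_DIRTY) == 0;
}

/* Walks through the three cases that matter for a write-protected (COW) entry. */
static void model_check(void)
{
    model_pte_t wp = 0; /* write-protected entry: PAGE_WRITE is clear */

    /* Buggy: the entry becomes silently writable, so no COW fault happens. */
    assert(!model_store_faults(model_mkdirty_buggy(wp)));
    /* Fixed: the entry stays write-protected, yet dirtiness is still recorded. */
    assert(model_store_faults(model_mkdirty_fixed(wp)));
    assert(model_mkdirty_fixed(wp) & PAGE_MODIFIED);
    /* Fixed: a genuinely writable entry still gets the hardware bit. */
    assert(!model_store_faults(model_mkdirty_fixed(PAGE_WRITE)));
}
```

In this model the buggy variant would let a child write straight into a page it still shares with its parent, which is consistent with the non-zero "c sum" seen on the unpatched kernel, though the real testcase in the gist exercises the actual THP split path rather than anything like this.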