Re: Test case for "mm/thp: carry over dirty bit when thp splits on pmd"

On Wed, Nov 16, 2022 at 01:45:15PM +0300, Anatoly Pugachev wrote:
> On Wed, Nov 16, 2022 at 11:49 AM hev <r@xxxxxx> wrote:
> >
> > Hello Peter,

Hi, Hev,

Thanks for letting me know.

> >
> > I see a random crash issue on the LoongArch system, that is caused by
> > commit 0ccf7f1 ("mm/thp: carry over dirty bit when thp splits on
> > pmd").
> >
> > The issue is now resolved. The root cause is that the arch's mkdirty
> > sets the hardware writable bit unconditionally. That breaks
> > write-protect and then breaks COW.

Could you help explain how that happened?

Taking loongarch as an example:

static inline pte_t pte_mkdirty(pte_t pte)
{
	pte_val(pte) |= (_PAGE_DIRTY | _PAGE_MODIFIED);
	return pte;
}

#define _PAGE_MODIFIED		(_ULCAST_(1) << _PAGE_MODIFIED_SHIFT)
#define	_PAGE_MODIFIED_SHIFT	9
#define _PAGE_DIRTY		(_ULCAST_(1) << _PAGE_DIRTY_SHIFT)
#define	_PAGE_DIRTY_SHIFT	1

I don't see where the write bit gets set by this; the write bit is bit 8:

#define _PAGE_WRITE		(_ULCAST_(1) << _PAGE_WRITE_SHIFT)
#define	_PAGE_WRITE_SHIFT	8

According to loongarch spec:

https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#section-multi-level-page-table-structure-supported-by-page-walking

Bits 1 & 8 match the spec's D & W definitions.  Bit 9 does not seem to be
defined there, and I didn't quickly spot how it relates to the write bit.
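
To make the bit arithmetic concrete, here is a small userspace sketch that
only models the masks quoted above.  The names model_pte_t, model_pte_mkdirty()
and friends are made up for illustration and are not kernel API, and the sketch
says nothing about how the hardware actually interprets each bit; it only
shows that OR-ing in bits 1 and 9 leaves bit 8 untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the loongarch definitions quoted above. */
#define MODEL_PAGE_DIRTY_SHIFT    1
#define MODEL_PAGE_WRITE_SHIFT    8
#define MODEL_PAGE_MODIFIED_SHIFT 9

#define MODEL_PAGE_DIRTY    (UINT64_C(1) << MODEL_PAGE_DIRTY_SHIFT)
#define MODEL_PAGE_WRITE    (UINT64_C(1) << MODEL_PAGE_WRITE_SHIFT)
#define MODEL_PAGE_MODIFIED (UINT64_C(1) << MODEL_PAGE_MODIFIED_SHIFT)

/* Hypothetical stand-in for pte_t: just the raw entry value. */
typedef struct {
	uint64_t val;
} model_pte_t;

/* Mirrors the quoted pte_mkdirty(): sets bits 1 and 9, nothing else. */
static inline model_pte_t model_pte_mkdirty(model_pte_t pte)
{
	pte.val |= (MODEL_PAGE_DIRTY | MODEL_PAGE_MODIFIED);
	return pte;
}

/* Tests bit 8 only (the _PAGE_WRITE position in the quoted defines). */
static inline int model_pte_write_bit(model_pte_t pte)
{
	return (pte.val & MODEL_PAGE_WRITE) != 0;
}

/*
 * Returns 1 if marking an entry dirty leaves bit 8 clear when it was
 * clear before.  This checks the mask arithmetic only.
 */
static inline int model_mkdirty_keeps_bit8_clear(void)
{
	model_pte_t pte = { 0 };

	assert(!model_pte_write_bit(pte));
	pte = model_pte_mkdirty(pte);
	return !model_pte_write_bit(pte);
}
```

Calling model_mkdirty_keeps_bit8_clear() from a test main() returns 1 with
these masks, so if write-protect is being broken, it would have to be through
the meaning of bit 1 or bit 9 rather than through bit 8 being set.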

> >
> > Here is a simple and fast testcase (It may be helpful for sparc64):
> > https://gist.github.com/heiher/72919fae6b53f04cac606a9631100506
> > (assertion: c sum == 0)
> 
> Just tried on my sparc64 VM: fixed vs. old (non-patched) kernels...
> 
> fixed kernel (6.1.0-rc5) running ./a.out:
> mator@ttip:~$ ./a.out
> c sum: 0
> p sum: 35184372088832
> c sum: 0
> p sum: 35184372088832
> c sum: 0
> p sum: 35184372088832
> c sum: 0
> p sum: 35184372088832
> c sum: 0
> p sum: 35184372088832
> ...
> 
> old (non-patched) kernel (6.1.0-rc4) :
> mator@ttip:~$ ./a.out
> c sum: 35150012350464
> p sum: 35184372088832
> c sum: 35150012350464
> p sum: 35184372088832
> ...

Thanks for the quick run, Anatoly.  Obviously I went down the wrong path
earlier with the code patching.  It seems we now have a better chance of
fixing this.

-- 
Peter Xu



