Re: Test case for "mm/thp: carry over dirty bit when thp splits on pmd"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Peter,

On Fri, Nov 18, 2022 at 2:29 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Thu, Nov 17, 2022 at 10:29:57AM +0800, hev wrote:
> > Hi Peter,
>
> Hi, Hev,
>
> >
> > On Thu, Nov 17, 2022 at 12:25 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Nov 16, 2022 at 01:45:15PM +0300, Anatoly Pugachev wrote:
> > > > On Wed, Nov 16, 2022 at 11:49 AM hev <r@xxxxxx> wrote:
> > > > >
> > > > > Hello Peter,
> > >
> > > Hi, Hev,
> > >
> > > Thanks for letting me know.
> > >
> > > > >
> > > > > I see a random crash issue  on the LoongArch system, that is caused by
> > > > > commit 0ccf7f1 ("mm/thp: carry over dirty bit when thp splits on
> > > > > pmd").
> > > > >
> > > > > Now, the thing is already resolved. The root cause is arch's mkdirty
> > > > > is set hardware writable bit in unconditional. That breaks
> > > > > write-protect and then breaks COW.
> > >
> > > Could you help explain how that happened?
> > >
> > > I'm taking example of loongarch here:
> > >
> > > static inline pte_t pte_mkdirty(pte_t pte)
> > > {
> > >         pte_val(pte) |= (_PAGE_DIRTY | _PAGE_MODIFIED);
> > >         return pte;
> > > }
> > >
> > > #define _PAGE_MODIFIED          (_ULCAST_(1) << _PAGE_MODIFIED_SHIFT)
> > > #define _PAGE_MODIFIED_SHIFT    9
> >
> > _PAGE_MODIFIED is a software dirty bit
> >
> > > #define _PAGE_DIRTY             (_ULCAST_(1) << _PAGE_DIRTY_SHIFT)
> > > #define _PAGE_DIRTY_SHIFT       1
> >
> > _PAGE_DIRTY is a hardware writable bit (bad naming), meaning that mmu
> > allows write memory without any exception raised.
>
> (I just missed this email before I reply to the other one, I should have
>  read this one first..)
>
> I see. This surprises me a bit, as I can't quickly tell how it'll always
> work with the generic mm code.
>
> Say, is there a quick answer on why _PAGE_DIRTY is set here rather than
> pte_mkwrite()?  Because AFAIU that's where the mm wants to grant write
> permission to a page table entry as the API, no?
>
> >
> > >
> > > I don't see when write bit is set, which is bit 8 instead:
> > >
> > > #define _PAGE_WRITE             (_ULCAST_(1) << _PAGE_WRITE_SHIFT)
> > > #define _PAGE_WRITE_SHIFT       8
> >
> > _PAGE_WRITE is a software writable bit (not hardware).
> >
> > As David said, In __split_huge_pmd_locked, the VMA does not include VM_WRITE,
> >
> > entry = maybe_mkwrite(entry, vma);
> >
> > so the pte does not include software writable bit (_PAGE_WRITE).
>
> Are you sure?  In your test case you mapped with RW, IIUC it means even
> after the fork() VM_WRITE is set on both sides?

Sorry, I was wrong.

In this case, both VMAs are writable, the pte's writable bit is
cleared by pte_wrprotect. So if pte_mkdirty sets hardware writable bit
unconditionally, then there will be no way to catch writes to
implement COW.

I will try to explain how it works about pte write, dirty and
write-protect on LoongArch in the LoongArch mailing-list.

Regards,
Ray

>
> But I agree the write bit is not set, not because !VM_WRITE, but because we
> take care of that explicitly to make sure pte has the same write bit as pmd:
>
>                 (pmd used to be wr-protected due to fork())
>                 write = pmd_write(old_pmd);
>                 ...
>
>                 (then when split pte shouldn't have write bit too)
>                 if (!write)
>                         entry = pte_wrprotect(entry);
>
> >
> > and the dirty is true,
> >
> > if (dirty)
> >     entry = pte_mkdirty(entry);
> >
> > so the incorrect arch's pte_mkdirty set hardware writable
> > bit(_PAGE_DIRTY) in unconditional for read-only pages.
>
> True, that does also apply to sparc64 pte_mkdirty() with _PAGE_W_4[UV].  I
> should have noticed earlier that its comment told me that's a write bit
> already..
>
> #define _PAGE_W_4U        _AC(0x0000000000000002,UL) /* Writable             */
>
> Thanks,
>
> --
> Peter Xu
>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux