Re: Test case for "mm/thp: carry over dirty bit when thp splits on pmd"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 19.11.22 15:06, hev wrote:
Hi, Peter,

On Fri, Nov 18, 2022 at 2:29 AM Peter Xu <peterx@xxxxxxxxxx> wrote:

On Thu, Nov 17, 2022 at 10:29:57AM +0800, hev wrote:
Hi Peter,

Hi, Hev,


On Thu, Nov 17, 2022 at 12:25 AM Peter Xu <peterx@xxxxxxxxxx> wrote:

On Wed, Nov 16, 2022 at 01:45:15PM +0300, Anatoly Pugachev wrote:
On Wed, Nov 16, 2022 at 11:49 AM hev <r@xxxxxx> wrote:

Hello Peter,

Hi, Hev,

Thanks for letting me know.


I see a random crash issue  on the LoongArch system, that is caused by
commit 0ccf7f1 ("mm/thp: carry over dirty bit when thp splits on
pmd").

Now, the thing is already resolved. The root cause is arch's mkdirty
is set hardware writable bit in unconditional. That breaks
write-protect and then breaks COW.

Could you help explain how that happened?

I'm taking example of loongarch here:

static inline pte_t pte_mkdirty(pte_t pte)
{
         pte_val(pte) |= (_PAGE_DIRTY | _PAGE_MODIFIED);
         return pte;
}

#define _PAGE_MODIFIED          (_ULCAST_(1) << _PAGE_MODIFIED_SHIFT)
#define _PAGE_MODIFIED_SHIFT    9

_PAGE_MODIFIED is a software dirty bit

#define _PAGE_DIRTY             (_ULCAST_(1) << _PAGE_DIRTY_SHIFT)
#define _PAGE_DIRTY_SHIFT       1

_PAGE_DIRTY is a hardware writable bit (bad naming), meaning that mmu
allows write memory without any exception raised.

(I just missed this email before I reply to the other one, I should have
  read this one first..)

I see. This surprises me a bit, as I can't quickly tell how it'll always
work with the generic mm code.

Say, is there a quick answer on why _PAGE_DIRTY is set here rather than
pte_mkwrite()?  Because AFAIU that's where the mm wants to grant write
permission to a page table entry as the API, no?



I don't see when write bit is set, which is bit 8 instead:

#define _PAGE_WRITE             (_ULCAST_(1) << _PAGE_WRITE_SHIFT)
#define _PAGE_WRITE_SHIFT       8

_PAGE_WRITE is a software writable bit (not hardware).

As David said, In __split_huge_pmd_locked, the VMA does not include VM_WRITE,

entry = maybe_mkwrite(entry, vma);

so the pte does not include software writable bit (_PAGE_WRITE).

Are you sure?  In your test case you mapped with RW, IIUC it means even
after the fork() VM_WRITE is set on both sides?

Sorry, I was wrong.

In this case, both VMAs are writable, the pte's writable bit is
cleared by pte_wrprotect. So if pte_mkdirty sets hardware writable bit
unconditionally, then there will be no way to catch writes to
implement COW.

I will try to explain how it works about pte write, dirty and
write-protect on LoongArch in the LoongArch mailing-list.

Just to ask again,

is code like
	maybe_mkwrite(pte_mkdirty(entry), vma);

Like we have in copy_present_page(), wp_page_reuse(), wp_page_copy() ... broken on LoongArch of the VMA lacks VM_WRITE?

That would need *real* fixing, no hacks around that in other code areas.

--
Thanks,

David / dhildenb




[Index of Archives]     [Kernel Development]     [DCCP]     [Linux ARM Development]     [Linux]     [Photo]     [Yosemite Help]     [Linux ARM Kernel]     [Linux SCSI]     [Linux x86_64]     [Linux Hams]

  Powered by Linux