[Bug 201631] WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0

bugzilla-daemon@xxxxxxxxxxxxxxxxxxx · Thu, 24 Jan 2019 08:15:52 +0000

https://bugzilla.kernel.org/show_bug.cgi?id=201631

--- Comment #51 from Jan Kara (jack@xxxxxxx) ---
(In reply to Aneesh Kumar KV from comment #50)
> (In reply to Jan Kara from comment #47)
> > OK, so it seems to be more and more clear that PPC indeed has some race in
> > page table updates. What I can see in the latest report is:
> > 
> > Clean page (index 92, ino 681741, i_size 828368, flags 7fff0000002016,
> > mapcount 1) with dirty PTE (pte_val c0000005f7fae186) on unmap! Vma flags
> > fb, pgoff 0, file ino 681741
> > ...
> > page 92: b_state 21, b_blocknr 2801084, b_mapped 1452389112002, b_mapped2
> 0,
> > b_cleaned 1452396217779, now 1452400395514
> > 
> > So "Vma flags fb" shows its a normal shared, writeable file mapping. Page
> is
> > somewhere in the middle of the file (file size is 828368, page is at offset
> > 376832). The page has been writeably mapped 11ms ago (you are using ext2
> > filesystem which was confusing my previous debug attempts so only this one
> > has shown proper times) and written back 4ms ago (which should have
> > writeprotected the pte) but we still have writeable pte now on which the
> > assertion hits. So either page_mkclean() failed to clear the PTE or someone
> > created new writeable PTE without telling ext4.
> > 
> > I'll attach a new version of debug patch to distinguish these two cases.
> 
> The fact that we did try to write out the page at (bh_cleaned
> 1452396217779)implies we should have cleared the _PAGE_WRITE bit right
> (clear_page_dirty_for_io())? 

Yes, clear_page_dirty_for_io() calls page_mkclean() which clears _PAGE_WRITE
bit. So at b_cleaned time there should be no writeable PTE.

> So we should either find that bit cleared in
> pte (if we missed a related tlb flush and tlb still has that pte with
> _PAGE_WRITE) or we find that set. In this case, we find _PAGE_WRITE set in
> the pte during zap. Does that imply we did call finish_fault()? which should
> have ideally resulted in we calling page_mkwrite().

The race is not clear to me either but the rule is that if you are creating
writeable PTE for a page, you must call ->page_mkwrite(). And from the debug
output page_mkclean() was called and no ->page_mkwrite() after that so there
should be no writeable PTE. But somehow there is one as zapping reports so we
need to find out who and when creates it without calling ->page_mkwrite(). New
version of my debug patch should tell us a bit more.

Note that there are other places that play with PTEs other than fault - like
page migration, mremap, mprotect, etc. All these seem to properly use PTE locks
to serialize with page_mkclean() but well... reality is what it is and there
must be bug somewhere :) After all there are close to 200 calls of set_pte_at()
in the kernel...

-- 
You are receiving this mail because:
You are watching the assignee of the bug.