https://bugzilla.kernel.org/show_bug.cgi?id=201631

--- Comment #51 from Jan Kara (jack@xxxxxxx) ---
(In reply to Aneesh Kumar KV from comment #50)
> (In reply to Jan Kara from comment #47)
> > OK, so it seems to be more and more clear that PPC indeed has some race in
> > page table updates. What I can see in the latest report is:
> >
> > Clean page (index 92, ino 681741, i_size 828368, flags 7fff0000002016,
> > mapcount 1) with dirty PTE (pte_val c0000005f7fae186) on unmap! Vma flags
> > fb, pgoff 0, file ino 681741
> > ...
> > page 92: b_state 21, b_blocknr 2801084, b_mapped 1452389112002, b_mapped2 0,
> > b_cleaned 1452396217779, now 1452400395514
> >
> > So "Vma flags fb" shows its a normal shared, writeable file mapping. Page is
> > somewhere in the middle of the file (file size is 828368, page is at offset
> > 376832). The page has been writeably mapped 11ms ago (you are using ext2
> > filesystem which was confusing my previous debug attempts so only this one
> > has shown proper times) and written back 4ms ago (which should have
> > writeprotected the pte) but we still have writeable pte now on which the
> > assertion hits. So either page_mkclean() failed to clear the PTE or someone
> > created new writeable PTE without telling ext4.
> >
> > I'll attach a new version of debug patch to distinguish these two cases.
>
> The fact that we did try to write out the page at (bh_cleaned
> 1452396217779) implies we should have cleared the _PAGE_WRITE bit right
> (clear_page_dirty_for_io())?

Yes, clear_page_dirty_for_io() calls page_mkclean(), which clears the
_PAGE_WRITE bit. So at b_cleaned time there should be no writeable PTE.

> So we should either find that bit cleared in
> pte (if we missed a related tlb flush and tlb still has that pte with
> _PAGE_WRITE) or we find that set. In this case, we find _PAGE_WRITE set in
> the pte during zap. Does that imply we did call finish_fault()? which should
> have ideally resulted in we calling page_mkwrite().

The race is not clear to me either, but the rule is that if you are creating
a writeable PTE for a page, you must call ->page_mkwrite(). From the debug
output, page_mkclean() was called and there was no ->page_mkwrite() after
that, so there should be no writeable PTE. But somehow there is one, as the
zapping code reports, so we need to find out who creates it, and when,
without calling ->page_mkwrite(). The new version of my debug patch should
tell us a bit more.

Note that there are places other than the fault path that play with PTEs,
like page migration, mremap, mprotect, etc. All of these seem to properly use
PTE locks to serialize with page_mkclean(), but well... reality is what it is
and there must be a bug somewhere :) After all, there are close to 200 calls
of set_pte_at() in the kernel...
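
For reference, the figures quoted from comment #47 (page offset, the "11ms" and "4ms" intervals, and the meaning of "Vma flags fb") can be re-derived with a short script. This is only a sketch under stated assumptions: the 4 KiB page size is inferred from 92 * 4096 = 376832, the b_mapped / b_cleaned / now values are assumed to be nanosecond timestamps, and the helper names are made up for illustration. The flag values are the low vm_flags bits from include/linux/mm.h.

```python
# Values copied from the debug output quoted in comment #47.
PAGE_INDEX = 92
I_SIZE = 828368
B_MAPPED = 1452389112002
B_CLEANED = 1452396217779
NOW = 1452400395514
VMA_FLAGS = 0xFB

# Assumption: 4 KiB pages (inferred from the quoted offset 376832).
PAGE_SIZE = 4096

# Low vm_flags bits as defined in include/linux/mm.h.
VM_BITS = {
    0x01: "VM_READ",
    0x02: "VM_WRITE",
    0x04: "VM_EXEC",
    0x08: "VM_SHARED",
    0x10: "VM_MAYREAD",
    0x20: "VM_MAYWRITE",
    0x40: "VM_MAYEXEC",
    0x80: "VM_MAYSHARE",
}


def page_offset(index, page_size=PAGE_SIZE):
    """Byte offset of a page-cache page within its file."""
    return index * page_size


def elapsed_ms(since, now=NOW):
    """Milliseconds between two timestamps, assuming they are in ns."""
    return (now - since) / 1e6


def decode_vma_flags(flags):
    """Names of the low vm_flags bits that are set in `flags`."""
    return [name for bit, name in VM_BITS.items() if flags & bit]


if __name__ == "__main__":
    print("page offset:", page_offset(PAGE_INDEX), "of", I_SIZE)
    print("mapped writeably %.2f ms ago" % elapsed_ms(B_MAPPED))
    print("written back     %.2f ms ago" % elapsed_ms(B_CLEANED))
    print("vma flags:", " | ".join(decode_vma_flags(VMA_FLAGS)))
```

VM_SHARED and VM_WRITE both being set is what makes this a shared, writeable mapping; VM_EXEC is the only low bit missing from 0xfb.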