On Thu, 2009-04-23 at 21:52 +0200, Miklos Szeredi wrote:
> On Thu, 23 Apr 2009, Trond Myklebust wrote:
> > I'm still working on the bug in
> > http://bugzilla.kernel.org/show_bug.cgi?id=12913 . One other source of
> > grief appears to be munmap(), which is calling set_page_dirty() on a
> > number of pages without locking them or first calling page_mkwrite().
> >
> > Currently, this means that we either ignore that dirty bit (since
> > nfs_page_async_flush() won't find a corresponding write request) or it
> > too can end up triggering the PG_CLEAN BUG() in fs/nfs/write.c:252 if
> > the timing is right.
> >
> > So what is the reason why zap_pte_range() calls set_page_dirty()
> > directly?
>
> In the old times this was one of the main ways of transferring the pte
> dirtyness to the PG_dirty page flag.
>
> Now this is mostly done at page fault time, and the pte's are always
> being re-protected whenever the PG_dirty flag is cleared (see
> page_mkclean()).
>
> But in some cases (shmfs being the example I know) pages are not write
> protected and so zap_pte_range(), and other functions, still need to
> transfer the pte dirtyness to the page flag.

My main worry is that this is all happening at munmap() time. There
shouldn't be any more page faults after that completes (am I right?), so
what other mechanism would transfer the pte dirtyness?

> Not sure how this matters to NFS though. If the above is correct,
> then the set_page_dirty() call in zap_pte_range() should always result
> in a no-op, since the PG_dirty flag would already have been set by the
> page fault...

If I can ignore the dirty flag on these occasions, then that would be
great. That would enable me to get rid of that BUG_ON(PG_CLEAN) in
write.c, and close the bug...

Cheers
  Trond
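
For reference, the user-space access pattern under discussion (dirty a
page through a shared mapping, then munmap() it without msync()) can be
sketched roughly as below. This is only an illustration, not a reproducer
for the bug: the helper name dirty_then_munmap() and the file path passed
to it are hypothetical, and whether the race is ever hit would depend on
the file living on an NFS mount and on timing.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: dirty one page of a file through a MAP_SHARED
 * mapping, then munmap() it without calling msync() first.
 * Returns 0 if the data stored through the mapping can be read back
 * with pread() afterwards, -1 on any failure.
 */
static int dirty_then_munmap(const char *path)
{
    const char msg[] = "dirtied via mmap";
    long page = sysconf(_SC_PAGESIZE);
    char buf[sizeof(msg)];
    char *map;
    int fd, ret = -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, page) < 0)        /* make the file one page long */
        goto out;

    map = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    memcpy(map, msg, sizeof(msg));      /* store through the shared mapping */

    if (munmap(map, page) < 0)          /* tear down the mapping, no msync() */
        goto out;

    /* The stored data must still be visible through the file. */
    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
        goto out;
    ret = memcmp(buf, msg, sizeof(msg)) == 0 ? 0 : -1;
out:
    close(fd);
    unlink(path);
    return ret;
}
```

Calling dirty_then_munmap() with a path on the filesystem of interest
exercises the munmap() teardown of a page that was written through the
mapping; a return value of 0 only says the data survived, it says nothing
about which kernel path marked the page dirty.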