On Fri 13-12-19 15:28:32, lixinhai.lxh@xxxxxxxxx wrote: > On 2019-12-13 at 00:55 Kirill A. Shutemov wrote: > >On Thu, Dec 12, 2019 at 11:21:18PM +0800, Li Xinhai wrote: > >> When vm_ops->page_mkwrite is defined, and called from wp_page_shared and > >> do_shared_fault, the set_page_dirty must already called by page_mkwrite. > > > >Must? Do all ->page_mkwrite implementation do this? > > My understanding is that set_page_dirty need be called before PTE is set > to allow writing. If not in this sequence, other thread will see a > writable PTE and dirty the page before current thread set_page_dirty. Yes, filesystems effectively do rely on this. > In ->page_mkwrite, FS can decide if set_page_dirty should be called or > not. I checked a few FS, ext4/xfs/btrsfs/ceph and generic > filemap_page_mkwrite, they called it. If FS provide ->page_mkwrite and > decide don't call set_page_dirty, why fault_dirty_shared_page call this > function unconditionally? or, I missed something? Well, generally the responsibility for dirtying the page has been on the generic MM code (i.e., fault_dirty_shared_page()). Now you're right that lots of filesystems will end up dirtying the page because they are reusing generic helpers for handling ->page_mkwrite() and that happens to dirty the page. But that is mostly a coincidence and not guarantee. So to safely remove page dirtying from fault_dirty_shared_page(), you'd need to audit *all* ->page_mkwrite() implementations, make sure they all dirty the page, and then document this requirement somewhere. Overall I don't think the effort is really worth it since redirtying already dirty page is very cheap. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR