On Wed 28-03-12 09:45:53, Dave Chinner wrote:
> On Wed, Mar 28, 2012 at 12:08:19AM +0200, Jan Kara wrote:
> > On Tue 27-03-12 14:38:15, Andrew Morton wrote:
> > > On Tue, 27 Mar 2012 09:55:27 +0200
> > > Jan Kara <jack@xxxxxxx> wrote:
> > > > On Fri 23-03-12 15:45:02, Andrew Morton wrote:
> > > > > On Mon, 5 Mar 2012 17:00:59 +0100
> > > > > Jan Kara <jack@xxxxxxx> wrote:
> > > > >
> > > > > > --- a/mm/filemap.c
> > > > > > +++ b/mm/filemap.c
> > > > > > @@ -1759,8 +1759,28 @@ page_not_uptodate:
> > > > > >  }
> > > > > >  EXPORT_SYMBOL(filemap_fault);
> > > > > >
> > > > > > +int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > > > > +{
> > > > > > +	struct page *page = vmf->page;
> > > > > > +	struct inode *inode = vma->vm_file->f_path.dentry->d_inode;
> > > > > > +	int ret = VM_FAULT_LOCKED;
> > > > > > +
> > > > > > +	file_update_time(vma->vm_file);
> > > > > > +	lock_page(page);
> > > > > > +	if ((page->mapping != inode->i_mapping) ||
> > > > > > +	    (page_offset(page) > i_size_read(inode))) {
> > > > >
> > > > > Would benefit from a comment explaining how the page can come to be
> > > > > outside i_size, and why we fail in that case.
> > > >   This?
> > > >
> > > > > I don't think i_mutex is held here, so this test is rather meaningless
> > > > > and racy anyway?
> > > >   i_size test is racy if that's what you mean by "this test". Just I did
> > > > the test this way because it's like this in other places and I figured
> > > > truncate_pagecache() can take relatively long time so the test has some
> > > > effect. But if you think it's not worth it, I can remove it.
> > >
> > > It bugs me when we copy-n-paste code without remembering why we had it
> > > there in the first place :(  iirc, mmapped pages outside i_size can and
> > > do happen in some race situations, and are benign.
> >   Yeah. Certainly there can be pages beyond i_size because we first set
> > file size and then go and remove pages beyond new i_size one by one when we
> > do truncate. We must be careful not to create any new pages beyond i_size
> > but that's what filemap_fault() takes care of. So I think i_size check in
> > ->page_mkwrite() isn't strictly needed.
>
> Actually, I think it is. In __do_fault(), we drop the page lock
> between the .fault call and the .page_mkwrite() call, so the size
> checks in .fault for the given page being faulted are no longer
> valid as truncate serialises only on the page lock. Hence we have to
> repeat the truncate race checks again in .page_mkwrite() after we
> relock the page.
  I was thinking about this scenario as well, but I think it doesn't cause
any problems. But maybe there's some noise in what the condition actually
is. So I currently have:

	if (page->mapping != inode->i_mapping)
		bail..
	do stuff.

  And I think that is enough. Because only filemap_fault() creates a new
page, ->page_mkwrite() only uses the page reference from vmf->page. So the
only thing that can happen after __do_fault() drops the page lock is that
truncate_pagecache() will go and truncate the page (thus page->mapping will
be NULL), and the condition I currently have should be enough to catch that.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
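
For illustration only, here is a minimal user-space sketch of the argument above: once the page is relocked, comparing page->mapping against inode->i_mapping is enough to notice that truncate removed the page while the lock was dropped. Everything in it is an assumption made for the sketch. The structs are simplified stand-ins rather than the kernel definitions, the model_* helpers and the MODEL_* result names are hypothetical, and the page lock is reduced to a flag with no real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct address_space { int placeholder; };
struct inode { struct address_space *i_mapping; };
struct page {
	struct address_space *mapping;	/* NULL once the page is truncated */
	bool locked;			/* models the page lock as a flag */
};

/* Hypothetical result codes for the model. */
enum model_result { MODEL_BAIL, MODEL_LOCKED };

static void model_lock_page(struct page *page)
{
	assert(!page->locked);
	page->locked = true;
}

static void model_unlock_page(struct page *page)
{
	assert(page->locked);
	page->locked = false;
}

/* Models truncate detaching the page from the mapping under the page lock. */
static void model_truncate_page(struct page *page)
{
	model_lock_page(page);
	page->mapping = NULL;
	model_unlock_page(page);
}

/*
 * Models the check from the mail: relock the page, then bail if it no
 * longer belongs to this inode's mapping. On success the page stays locked.
 */
static enum model_result model_page_mkwrite(struct page *page,
					    const struct inode *inode)
{
	model_lock_page(page);
	if (page->mapping != inode->i_mapping) {
		model_unlock_page(page);
		return MODEL_BAIL;
	}
	return MODEL_LOCKED;
}
```

Calling model_page_mkwrite() on a page whose mapping still matches returns MODEL_LOCKED with the page left locked. Calling it after model_truncate_page() has run returns MODEL_BAIL, because the mapping pointer is NULL and can no longer equal inode->i_mapping. The sketch does not model i_size at all, which is the point of the argument: the mapping comparison alone detects the truncated page.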