On Mon, Dec 02, 2013 at 08:58:25AM -0700, Matthew Wilcox wrote: > On Mon, Dec 02, 2013 at 09:33:18AM +0100, Jan Kara wrote: > > > Indeed, XIP could make use of the structures we already > > > have in the struct inode/address space to behave more like a normal > > > filesystem. We already use the mapping tree to store > > > per-page information that is used to serialise truncate vs page > > > faults, so why not make XIP do exactly the same thing? > > I believe that grabbing mmap_sem for writing during truncate (in case of > > ext2 around xip_truncate_page() & truncate_setsize() calls should do the > > trick. But I need to verify with lockdep that it doesn't introduce new > > locking problems. > > Umm ... mmap_sem for write? On each of the tasks that have a file mmaped? > I know we have double_down that orders by memory address, but I don't > think we have an N_down_write(). Here's a simpler solution: Grab i_mmap_sem over vm_insert_mixed(). i_mmap_sem is already taken in unmap_mapping_range(), called from truncate_pagecache(), so we know that while we hold i_mmap_sem that i_size has already been updated and the block can't be deallocated from under us. We know that i_mmap_sem is safe to take here because it already is (in __xip_unmap()). Now we need only worry about holepunch, and it's the same problem for pagecache and XIP files alike. diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c index d8d9fe3..f6de4c3 100644 --- a/mm/filemap_xip.c +++ b/mm/filemap_xip.c @@ -260,8 +260,17 @@ again: __xip_unmap(mapping, vmf->pgoff); found: + /* We must recheck i_size under i_mmap_mutex */ + mutex_lock(&mapping->i_mmap_mutex); + size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> + PAGE_CACHE_SHIFT; + if (unlikely(vmf->pgoff >= size)) { + mutex_unlock(&mapping->i_mmap_mutex); + return VM_FAULT_SIGBUS; + } err = vm_insert_mixed(vma, (unsigned long)vmf->virtual_address, xip_pfn); + mutex_unlock(&mapping->i_mmap_mutex); if (err == -ENOMEM) return VM_FAULT_OOM; /* @@ -285,6 +294,15 @@ found: } if (error != -ENODATA) goto out; + + /* We must recheck i_size under i_mmap_mutex */ + mutex_lock(&mapping->i_mmap_mutex); + size = (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> + PAGE_CACHE_SHIFT; + if (unlikely(vmf->pgoff >= size)) { + ret = VM_FAULT_SIGBUS; + goto out; + } /* not shared and writable, use xip_sparse_page() */ page = xip_sparse_page(); if (!page) @@ -296,6 +314,7 @@ found: ret = VM_FAULT_NOPAGE; out: + mutex_unlock(&mapping->i_mmap_mutex); write_seqcount_end(&xip_sparse_seq); mutex_unlock(&xip_sparse_mutex); -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html