Re: [PATCH v12 08/20] dax,ext2: Replace the XIP page fault handler with the DAX page fault handler

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Tue, 13 Jan 2015 14:47:53 -0800

On Tue, 13 Jan 2015 16:53:34 -0500 Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote:

> /*
>  * Lock ordering in mm:
>  *
>  * inode->i_mutex       (while writing or truncating, not reading or faulting)
>  *   mm->mmap_sem
> 
> > >  	   In the worst case, the file still has blocks
> > > +	 * allocated past the end of the file.
> > > +	 */
> > > +	size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > > +	if (unlikely(vmf->pgoff >= size)) {
> > > +		error = -EIO;
> > > +		goto out;
> > > +	}
> > 
> > How does this play with holepunching?  Checking i_size won't work there?
> 
> It doesn't.  But the same problem exists with non-DAX files too, and
> when I pointed it out, it was met with a shrug from the crowd.  I saw a
> patch series just recently that fixes it for XFS, but as far as I know,
> btrfs and ext4 still don't play well with pagefault vs hole-punch races.

What are the user-visible effects of the race?
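
For reference, the quoted check rounds i_size up to whole pages and rejects any fault at or past that page count. Because it looks only at i_size, a hole punched inside the file still passes it, which is the gap discussed above. A minimal user-space sketch of the arithmetic, assuming a 4 KiB page (SKETCH_PAGE_SHIFT of 12); the helper names are hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration only; the kernel gets these from the architecture. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Number of pages needed to cover i_size bytes, rounding up. */
static uint64_t pages_covering(uint64_t i_size)
{
	/* Guard the round-up addition against wrapping. */
	assert(i_size <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1));
	return (i_size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Mirrors the quoted "pgoff >= size" test: true when the faulting page
 * offset lies at or beyond end of file.  It knows nothing about holes
 * inside the file.
 */
static bool fault_beyond_eof(uint64_t pgoff, uint64_t i_size)
{
	return pgoff >= pages_covering(i_size);
}
```

With these assumptions, a 4097-byte file spans two pages, so page offsets 0 and 1 pass the check and offset 2 is the first to be rejected.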

> > > +	memset(&bh, 0, sizeof(bh));
> > > +	block = (sector_t)vmf->pgoff << (PAGE_SHIFT - blkbits);
> > > +	bh.b_size = PAGE_SIZE;
> > 
> > ah, there.
> > 
> > PAGE_SIZE varies a lot between architectures.  What are the
> > implications of this?
> 
> At the moment, you can only do DAX for blocksizes that are equal to
> PAGE_SIZE.  That's a restriction that existed for the previous XIP code,
> and I haven't fixed it all for DAX yet.  I'd like to, but it's not high on
> my list of things to fix.  Since these are in-memory filesystems, there's
> not likely to be high demand to move the filesystem between machines.

hm, I guess not.

This means that our users will need to mkfs their filesystems with
blocksize==pagesize.  The "error: unsupported blocksize for dax" printk
should get the message across, but a mention in
Documentation/filesystems/dax.txt's "Shortcomings" section wouldn't
hurt.
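
To make the constraint concrete, here is a small user-space sketch of the quoted pgoff-to-block shift and of the blocksize==pagesize requirement. The helper names are hypothetical, and the shifts are passed as parameters so different page sizes can be tried; this is not the kernel's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per the thread, DAX currently needs blocksize == PAGE_SIZE. */
static bool dax_blocksize_ok(unsigned blkbits, unsigned page_shift)
{
	return blkbits == page_shift;
}

/*
 * Same arithmetic as the quoted line:
 *   block = (sector_t)vmf->pgoff << (PAGE_SHIFT - blkbits);
 * The shift is only meaningful when blocks are no larger than pages.
 */
static uint64_t pgoff_to_block(uint64_t pgoff, unsigned blkbits,
			       unsigned page_shift)
{
	assert(blkbits <= page_shift);
	return pgoff << (page_shift - blkbits);
}
```

For example, a filesystem made with 4 KiB blocks (blkbits of 12) passes dax_blocksize_ok() with 4 KiB pages but fails it on a machine configured with 64 KiB pages (page_shift of 16), which is why moving such a filesystem between architectures would need a fresh mkfs.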
