Re: [PATCH 1/2] iomap: Do not unshare exents beyond EOF

Brian Foster <bfoster@xxxxxxxxxx> · Fri, 6 Sep 2024 15:11:58 -0400

On Thu, Sep 05, 2024 at 06:24:24PM +0800, Julian Sun wrote:
> Attempting to unshare extents beyond EOF will trigger
> the need zeroing case, which in turn triggers a warning.
> Therefore, let's skip the unshare process if extents are
> beyond EOF.
> 
> Reported-and-tested-by: syzbot+296b1c84b9cbf306e5a0@xxxxxxxxxxxxxxxxxxxxxxxxx
> Closes: https://syzkaller.appspot.com/bug?extid=296b1c84b9cbf306e5a0
> Fixes: 32a38a499104 ("iomap: use write_begin to read pages to unshare")
> Inspired-by: Dave Chinner <david@xxxxxxxxxxxxx>
> Signed-off-by: Julian Sun <sunjunchao2870@xxxxxxxxx>
> ---
>  fs/iomap/buffered-io.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index f420c53d86ac..8898d5ec606f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1340,6 +1340,9 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
>  	/* don't bother with holes or unwritten extents */
>  	if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN)
>  		return length;
> +	/* don't try to unshare any extents beyond EOF. */
> +	if (pos > i_size_read(iter->inode))
> +		return length;

Hi Julian,

What about if pos starts within EOF and the operation extends beyond it?
I ask because I think I've reproduced this scenario, though it is a bit
tricky and has dependencies...

For one, it seems to depend on the cowblocks patch I recently posted
here [1] (though I don't think this is necessarily a problem with the
patch, it just depends on keeping COW fork blocks around after the
unshare). With that, I reproduce via fsx with unshare range support [2]
using the ops file appended below [3] on a -bsize=1k XFS fs.

I haven't quite characterized the full sequence other than it looks like
the unshare walks across EOF with a shared data fork block and COW fork
delalloc, presumably finds the post-eof part of the folio !uptodate (so
iomap_adjust_read_range() doesn't skip it), and then trips over the
warning and error return associated with the folio zeroing in
__iomap_write_begin().

This all kind of has me wondering.. do we know the purpose of this
warning/error in the first place? It seems like it's more of an
"unexpected situation" than a specific problem. E.g., assuming the same
page were mmap'd, I _think_ the read fault path would do the same sort
of zeroing such that the unshare would see a fully uptodate folio and
carry on as normal. I added the mapread op to the opsfile below to give
that a quick test (remove the "skip" text to enable it), and it seems to
prevent the error, but I've not confirmed whether that theory is
actually what's happening.

FWIW, I also wonder if another way to handle this would be to just
restrict the range of iomap_file_unshare() to within EOF. IOW if a
caller passes a range beyond EOF, just process whatever part of the
range falls within EOF. It seems iomap isn't responsible for the file
extending aspect of the fallocate unshare command anyways.

Thoughts?

Brian

[1] https://lore.kernel.org/linux-xfs/20240906114051.120743-1-bfoster@xxxxxxxxxx/
[2] https://lore.kernel.org/fstests/20240906185606.136402-1-bfoster@xxxxxxxxxx/
[3] fsx ops file:

fallocate 0x3bc00 0x400 0
write 0x3bc00 0x800 0x3c000
clone_range 0x3bc00 0x400 0x0 0x3c400
skip mapread 0x3c000 0x400 0x3c400
fallocate 0x3bc00 0xc00 0x3c400 unshare