Re: [PATCH 2/5] ext4: move inode extension/truncate code out from ext4_iomap_end()

Matthew Bobrowski <mbobrowski@xxxxxxxxxxxxxx> · Thu, 29 Aug 2019 07:54:21 +1000

On Wed, Aug 28, 2019 at 09:59:14PM +0200, Jan Kara wrote:
> On Mon 12-08-19 22:52:53, Matthew Bobrowski wrote:
> > +static int ext4_handle_inode_extension(struct inode *inode, loff_t size,
> > +				       size_t count)
> > +{
> > +	handle_t *handle;
> > +	bool truncate = false;
> > +	ext4_lblk_t written_blk, end_blk;
> > +	int ret = 0, blkbits = inode->i_blkbits;
> > +
> > +	handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
> > +	if (IS_ERR(handle)) {
> > +		ret = PTR_ERR(handle);
> > +		goto orphan_del;
> > +	}
> > +
> > +	if (ext4_update_inode_size(inode, size))
> > +		ext4_mark_inode_dirty(handle, inode);
> > +
> > +	/*
> > +	 * We may need truncate allocated but not written blocks
> > +	 * beyond EOF.
> > +	 */
> > +	written_blk = ALIGN(size, 1 << blkbits);
> > +	end_blk = ALIGN(size + count, 1 << blkbits);
> 
> So this seems to imply that 'size' is really offset where IO started but
> ext4_update_inode_size(inode, size) above suggests 'size' is really where
> IO has ended and that's indeed what you pass into
> ext4_handle_inode_extension(). So I'd just make the calling convention for
> ext4_handle_inode_extension() less confusing and pass 'offset' and 'len'
> and fixup the math inside the function...

Agree, that will help with making things more transparent.

Also, one other thing while looking at this patch again. See comment
below.

> > @@ -257,6 +308,13 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
> >  		goto out;
> >  
> >  	ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
> > +
> > +	if (ret > 0 && iocb->ki_pos > i_size_read(inode)) {
> > +		err = ext4_handle_inode_extension(inode, iocb->ki_pos,
> > +						  iov_iter_count(from));
> > +		if (err)
> > +			ret = err;
> > +	}

I noticed that within ext4_dax_write_iter() we're not accommodating
for error cases. Subsequently, there's no clean up code that goes with
that. So, for example, in the case where we've added the inode onto
the orphan list as a result of an extension and we bump into any error
i.e. -ENOSPC, we'll be left with inconsistencies. Perhaps it might be
worthwhile to introduce a helper here
i.e. ext4_dax_handle_failed_write(), which would be the path taken to
perform any necessary clean up routines in the case of a failed write?
I think it'd be better to have this rather than polluting
ext4_dax_write_iter(). What do you think?

--M