Re: [RFC PATCH 12/18] iomap: don't increase i_size if it's not a write operation

Zhang Yi <yi.zhang@xxxxxxxxxxxxxxx> · Thu, 30 Nov 2023 20:26:28 +0800

On 2023/11/23 23:34, Christoph Hellwig wrote:
> On Thu, Nov 23, 2023 at 08:51:14PM +0800, Zhang Yi wrote:
>> index fd4d43bafd1b..3b9ba390dd1b 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -852,13 +852,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
>>  	 * cache.  It's up to the file system to write the updated size to disk,
>>  	 * preferably after I/O completion so that no stale data is exposed.
>>  	 */
>> -	if (pos + ret > old_size) {
>> +	if ((iter->flags & IOMAP_WRITE) && pos + ret > old_size) {
>>  		i_size_write(iter->inode, pos + ret);
>>  		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
>>  	}
>>  	__iomap_put_folio(iter, pos, ret, folio);
>>  
>> -	if (old_size < pos)
>> +	if ((iter->flags & IOMAP_WRITE) && old_size < pos)
>>  		pagecache_isize_extended(iter->inode, old_size, pos);
>>  	if (ret < len)
>>  		iomap_write_failed(iter->inode, pos + ret, len - ret);
> 
> I agree with your rationale, but I hate how this code ends up
> looking.  In many ways iomap_write_end seems like the wrong
> place to update the inode size anyway.  I've not done a deep
> analysis, but I think there shouldn't really be any major blocker
> to only setting IOMAP_F_SIZE_CHANGED in iomap_write_end, and then
> move updating i_size and calling pagecache_isize_extended to
> iomap_write_iter.
> 

Think about it in depth, I think we cannot move updating i_size
to iomap_write_iter() because we have to do this under folio lock,
otherwise, once we unlock folio, the writeback process could start
writing back and call folio_zero_segment() to zero out the valid
data beyond the unupdated i_size. Only if we move
__iomap_put_folio() out together, but I suppose it's not a good
way.

Thanks,
Yi.