Re: [PATCH v2 6/6] iomap: reduce unnecessary state_lock when setting ifs uptodate and dirty bits

Zhang Yi <yi.zhang@xxxxxxxxxxxxxxx> · Tue, 13 Aug 2024 16:15:35 +0800

On 2024/8/13 1:00, Matthew Wilcox wrote:
> On Mon, Aug 12, 2024 at 08:11:59PM +0800, Zhang Yi wrote:
>> @@ -866,9 +899,8 @@ static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
>>  	 */
>>  	if (unlikely(copied < len && !folio_test_uptodate(folio)))
>>  		return false;
>> -	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
>> -	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
>> -	filemap_dirty_folio(inode->i_mapping, folio);
>> +
>> +	iomap_set_range_dirty_uptodate(folio, from, copied);
>>  	return true;
> 
> I wonder how often we overwrite a completely uptodate folio rather than
> writing new data to a fresh folio?  iow, would this be a measurable
> optimisation?
> 
> 	if (folio_test_uptodate(folio))
> 		iomap_set_range_dirty(folio, from, copied);
> 	else
> 		iomap_set_range_dirty_uptodate(folio, from, copied);
> 

Thanks for the suggestion. I'm not sure how often that happens either, but
I think we can do this optimisation: I've tested it and found it is
harmless for the case of writing new data to a fresh folio, and it further
improves overwrite performance. The UnixBench results show the performance
gain over the base increases to about 15% on my machine with a 50GB
ramdisk and an xfs filesystem.

UnixBench test cmd:
 ./Run -i 1 -c 1 fstime-w

Base:
x86    File Write 1024 bufsize 2000 maxblocks       524708.0 KBps
arm64  File Write 1024 bufsize 2000 maxblocks       801965.0 KBps

After this series:
x86    File Write 1024 bufsize 2000 maxblocks       569218.0 KBps
arm64  File Write 1024 bufsize 2000 maxblocks       871605.0 KBps

After this series plus the optimisation suggested above:
x86    File Write 1024 bufsize 2000 maxblocks       609620.0 KBps
arm64  File Write 1024 bufsize 2000 maxblocks       910534.0 KBps

Thanks,
Yi.