On 2024/8/13 1:00, Matthew Wilcox wrote:
> On Mon, Aug 12, 2024 at 08:11:59PM +0800, Zhang Yi wrote:
>> @@ -866,9 +899,8 @@ static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
>>           */
>>          if (unlikely(copied < len && !folio_test_uptodate(folio)))
>>                  return false;
>> -        iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
>> -        iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
>> -        filemap_dirty_folio(inode->i_mapping, folio);
>> +
>> +        iomap_set_range_dirty_uptodate(folio, from, copied);
>>          return true;
>
> I wonder how often we overwrite a completely uptodate folio rather than
> writing new data to a fresh folio?  iow, would this be a measurable
> optimisation?
>
>         if (folio_test_uptodate(folio))
>                 iomap_set_range_dirty(folio, from, copied);
>         else
>                 iomap_set_range_dirty_uptodate(folio, from, copied);
>

Thanks for the suggestion. I'm not sure how often that happens either, but
I think we can do this optimisation: I've tested it and found that it is
harmless for the case of writing new data to a fresh folio, and it further
improves overwrite performance. The UnixBench results show that the
performance gain increases to about 15% on my machine with a 50GB ramdisk
and an xfs filesystem.

UnixBench test cmd:
 ./Run -i 1 -c 1 fstime-w

Base:
x86    File Write 1024 bufsize 2000 maxblocks       524708.0 KBps
arm64  File Write 1024 bufsize 2000 maxblocks       801965.0 KBps

After this series:
x86    File Write 1024 bufsize 2000 maxblocks       569218.0 KBps
arm64  File Write 1024 bufsize 2000 maxblocks       871605.0 KBps

After this measurable optimisation:
x86    File Write 1024 bufsize 2000 maxblocks       609620.0 KBps
arm64  File Write 1024 bufsize 2000 maxblocks       910534.0 KBps

Thanks,
Yi.
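
For anyone who wants to check the "about 15%" figure against the numbers above, here is a minimal sketch of the arithmetic in C. The helper names gain_percent() and print_gains() are made up for illustration and are not part of the series; the throughput values are the fstime-w results quoted in this mail. By my calculation the gain over base is roughly 8.5% (x86) and 8.7% (arm64) after the series, and roughly 16.2% (x86) and 13.5% (arm64) with the additional uptodate check.

```c
#include <assert.h>
#include <stdio.h>

/* Relative throughput gain of after_kbps over base_kbps, in percent. */
double gain_percent(double base_kbps, double after_kbps)
{
	assert(base_kbps > 0.0);
	return (after_kbps - base_kbps) / base_kbps * 100.0;
}

/* Print the gains for the fstime-w figures quoted in this mail. */
void print_gains(void)
{
	/* Columns: base, after this series, after the extra optimisation. */
	static const struct {
		const char *arch;
		double base, series, optimised;
	} results[] = {
		{ "x86",   524708.0, 569218.0, 609620.0 },
		{ "arm64", 801965.0, 871605.0, 910534.0 },
	};
	size_t i;

	for (i = 0; i < sizeof(results) / sizeof(results[0]); i++)
		printf("%-6s series: +%.1f%%  optimised: +%.1f%%\n",
		       results[i].arch,
		       gain_percent(results[i].base, results[i].series),
		       gain_percent(results[i].base, results[i].optimised));
}
```

Calling print_gains() from a small main() prints one line per architecture. Both percentages are relative to the base kernel, not to each other.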