On 2024/8/14 9:49, Dave Chinner wrote:
> On Mon, Aug 12, 2024 at 08:11:53PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>
>> Changes since v1:
>>  - Patch 5 fix a stale data exposure problem pointed out by Willy, drop
>>    the setting of uptodate bits after zeroing out unaligned range.
>>  - As Dave suggested, in order to prevent increasing the complexity of
>>    maintain the state_lock, don't just drop all the state_lock in the
>>    buffered write path, patch 6 introduce a new helper to set uptodate
>>    bit and dirty bits together under the state_lock, reduce one time of
>>    locking per write, the benefits of performance optimization do not
>>    change too much.
>
> It's helpful to provide a lore link to the previous version so that
> reviewers don't have to go looking for it themselves to remind them
> of what was discussed last time.
>
> https://lore.kernel.org/linux-xfs/20240731091305.2896873-1-yi.zhang@xxxxxxxxxxxxxxx/T/

Sure, will add in my later iterations.

>
>> This series contains some minor non-critical fixes and performance
>> improvements on the filesystem with block size < folio size.
>>
>> The first 4 patches fix the handling of setting and clearing folio ifs
>> dirty bits when mark the folio dirty and when invalidat the folio.
>> Although none of these code mistakes caused a real problem now, it's
>> still deserve a fix to correct the behavior.
>>
>> The second 2 patches drop the unnecessary state_lock in ifs when setting
>> and clearing dirty/uptodate bits in the buffered write path, it could
>> improve some (~8% on my machine) buffer write performance. I tested it
>> through UnixBench on my x86_64 (Xeon Gold 6151) and arm64 (Kunpeng-920)
>> virtual machine with 50GB ramdisk and xfs filesystem, the results shows
>> below.
>>
>> UnixBench test cmd:
>>  ./Run -i 1 -c 1 fstime-w
>>
>> Before:
>> x86    File Write 1024 bufsize 2000 maxblocks       524708.0 KBps
>> arm64  File Write 1024 bufsize 2000 maxblocks       801965.0 KBps
>>
>> After:
>> x86    File Write 1024 bufsize 2000 maxblocks       569218.0 KBps
>> arm64  File Write 1024 bufsize 2000 maxblocks       871605.0 KBps
>
> Those are the same performance numbers as you posted for the
> previous version of the patch. How does this new version perform
> given that it's a complete rework of the optimisation? It's

They are not exactly the same, but the difference is small. I've
updated the performance numbers in this cover letter.

> important to know if the changes made actually provided the benefit
> we expected them to make....
>
> i.e. this is the sort of table of results I'd like to see provided:
>
> platform    base        v1          v2
> x86         524708.0    569218.0    ????
> arm64       801965.0    871605.0    ????
>

platform    base        v1          v2
x86         524708.0    571315.0    569218.0
arm64       801965.0    876077.0    871605.0

(File Write 1024 bufsize 2000 maxblocks, in KBps.)

Thanks,
Yi.
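
For reference, the relative gains implied by the base/v1/v2 table can be worked out with a few lines of Python. This is only an illustrative sketch: the `RESULTS` layout and the `gain_percent` and `summarize` names are made up here and are not part of the series; the throughput figures are copied from the table in this thread.

```python
# Throughput in KBps, copied from the base/v1/v2 table in this thread.
# The dictionary layout is an assumption made for this illustration.
RESULTS = {
    "x86":   {"base": 524708.0, "v1": 571315.0, "v2": 569218.0},
    "arm64": {"base": 801965.0, "v1": 876077.0, "v2": 871605.0},
}


def gain_percent(base: float, new: float) -> float:
    """Relative throughput change of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


def summarize(results: dict) -> dict:
    """Map each platform to its v1 and v2 gains over base, in percent."""
    return {
        platform: {
            version: gain_percent(row["base"], row[version])
            for version in ("v1", "v2")
        }
        for platform, row in results.items()
    }


if __name__ == "__main__":
    for platform, gains in summarize(RESULTS).items():
        print(f"{platform}: v1 {gains['v1']:+.2f}%  v2 {gains['v2']:+.2f}%")
```

On these figures both versions come out at roughly 8.5% to 9.2% over base, in line with the "~8%" quoted in the cover letter, and v2 is about half a percentage point below v1 on both platforms.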