Re: [PATCH v2 0/6] iomap: some minor non-critical fixes and improvements when block size < folio size

On 2024/8/14 9:49, Dave Chinner wrote:
> On Mon, Aug 12, 2024 at 08:11:53PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>
>> Changes since v1:
>>  - Patch 5 fixes a stale data exposure problem pointed out by Willy:
>>    drop the setting of uptodate bits after zeroing out an unaligned
>>    range.
>>  - As Dave suggested, to avoid complicating the maintenance of the
>>    state_lock, don't drop the state_lock entirely in the buffered
>>    write path. Instead, patch 6 introduces a new helper that sets the
>>    uptodate and dirty bits together under the state_lock, saving one
>>    locking round-trip per write; the performance benefit is largely
>>    unchanged (see the sketch below).
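
For illustration, here is a minimal sketch of what such a combined
helper could look like. The name (ifs_set_range_dirty_uptodate) is
hypothetical, not the literal patch 6; the layout assumes the existing
iomap_folio_state convention of uptodate bits followed by dirty bits
in a single state bitmap:

	static void ifs_set_range_dirty_uptodate(struct folio *folio,
			struct iomap_folio_state *ifs, size_t off, size_t len)
	{
		struct inode *inode = folio->mapping->host;
		unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
		unsigned int first_blk = off >> inode->i_blkbits;
		unsigned int last_blk = (off + len - 1) >> inode->i_blkbits;
		unsigned int nr_blks = last_blk - first_blk + 1;
		unsigned long flags;

		/* One lock round-trip now covers both bitmaps. */
		spin_lock_irqsave(&ifs->state_lock, flags);
		/* uptodate bits live in the first half of the bitmap */
		bitmap_set(ifs->state, first_blk, nr_blks);
		/* dirty bits follow, offset by blocks-per-folio */
		bitmap_set(ifs->state, first_blk + blks_per_folio, nr_blks);
		spin_unlock_irqrestore(&ifs->state_lock, flags);
	}
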
> 
> It's helpful to provide a lore link to the previous version so that
> reviewers don't have to go looking for it themselves to remind them
> of what was discussed last time.
> 
> https://lore.kernel.org/linux-xfs/20240731091305.2896873-1-yi.zhang@xxxxxxxxxxxxxxx/T/

Sure, will add in my later iterations.

> 
>> This series contains some minor non-critical fixes and performance
>> improvements on the filesystem with block size < folio size.
>>
>> The first 4 patches fix the handling of setting and clearing folio ifs
>> dirty bits when marking the folio dirty and when invalidating the
>> folio. Although none of these mistakes causes a real problem today,
>> they still deserve a fix to correct the behavior.
>>
>> The last 2 patches drop an unnecessary state_lock cycle in ifs when
>> setting and clearing dirty/uptodate bits in the buffered write path,
>> which improves buffered write performance by about 8% on my machines.
>> I tested it with UnixBench on x86_64 (Xeon Gold 6151) and arm64
>> (Kunpeng-920) virtual machines, each with a 50GB ramdisk and an XFS
>> filesystem; the results are shown below.
>>
>> UnixBench test cmd:
>>  ./Run -i 1 -c 1 fstime-w
>>
>> Before:
>> x86    File Write 1024 bufsize 2000 maxblocks       524708.0 KBps
>> arm64  File Write 1024 bufsize 2000 maxblocks       801965.0 KBps
>>
>> After:
>> x86    File Write 1024 bufsize 2000 maxblocks       569218.0 KBps
>> arm64  File Write 1024 bufsize 2000 maxblocks       871605.0 KBps
> 
> Those are the same performance numbers as you posted for the
> previous version of the patch. How does this new version perform
> given that it's a complete rework of the optimisation? It's

They're not exactly the same, but the difference is small; I've updated
the performance numbers in this cover letter.

> important to know if the changes made actually provided the benefit
> we expected them to make....
> 
> i.e. this is the sort of table of results I'd like to see provided:
> 
> platform	base		v1		v2
> x86		524708.0	569218.0	????
> arm64		801965.0	871605.0	????
> 

 platform	base		v1		v2
 x86		524708.0	571315.0 	569218.0
 arm64		801965.0	876077.0	871605.0
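
Working those out against the base column: v2 keeps nearly all of the
v1 gain, at about +8.5% over base on x86 (569218 / 524708) and +8.7%
on arm64 (871605 / 801965), versus +8.9% and +9.2% for v1.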

Thanks,
Yi.
