Re: [PATCH] vfs: Optimize dedupe comparison

On 15.07.21 г. 17:30, Matthew Wilcox wrote:
> On Thu, Jul 15, 2021 at 05:13:09PM +0300, Nikolay Borisov wrote:
>> Currently the comparison method that vfs_dedupe_file_range_compare
>> utilizes is a plain memcmp, which effectively means the code is doing a
>> byte-by-byte comparison. Instead, the code could do a word-sized
>> comparison without any adverse effect on performance, provided that the
>> comparison length is at least as big as the native word size and the
>> resulting memory addresses are properly aligned.
> 
> Sounds to me like somebody hasn't optimised memcmp() very well ...
> is this x86-64?
> 

That was my first impression as well; here's the profile:

       │    Disassembly of section .text:
       │
       │    ffffffff815c6f60 <memcmp>:
       │    memcmp():
       │      test   %rdx,%rdx
       │    ↓ je     22
       │      xor    %ecx,%ecx
       │    ↓ jmp    12
 49.32 │ 9:   add    $0x1,%rcx
  0.03 │      cmp    %rcx,%rdx
 11.82 │    ↓ je     21
  0.01 │12:   movzbl (%rdi,%rcx,1),%eax
 38.19 │      movzbl (%rsi,%rcx,1),%r8d
  0.59 │      sub    %r8d,%eax
  0.04 │    ↑ je     9
       │    ← retq
       │21: ← retq
       │22:   xor    %eax,%eax
       │    ← retq


It's indeed on x86-64, and judging by arch/x86/boot/string.h the sources
are using __builtin_memcmp.

>> @@ -256,9 +257,35 @@ static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
>>  		flush_dcache_page(src_page);
>>  		flush_dcache_page(dest_page);
>>
>> -		if (memcmp(src_addr + src_poff, dest_addr + dest_poff, cmp_len))
>> -			same = false;
>>
>> +		if (!IS_ALIGNED((unsigned long)(src_addr + src_poff), block_size) ||
>> +		    !IS_ALIGNED((unsigned long)(dest_addr + dest_poff), block_size) ||
>> +		    cmp_len < block_size) {
> 
> Can this even happen?  Surely we can only dedup on a block boundary and
> blocks are required to be a power of two and at least 512 bytes in size?

I was wondering the same thing, but AFAICS it seems to be possible, e.g.
if userspace passes bad offsets. While all kinds of internal fs
synchronization ops are going to be performed on aligned offsets, that
doesn't mean the original offsets passed from userspace are themselves
aligned.
> 
>> +			if (memcmp(src_addr + src_poff, dest_addr + dest_poff,
>> +				   cmp_len))
>> +				same = false;
>> +		} else {
>> +			int i;
>> +			size_t blocks = cmp_len / block_size;
>> +			loff_t rem_len = cmp_len - (blocks * block_size);
>> +			unsigned long *src = src_addr + src_poff;
>> +			unsigned long *dst = dest_addr + dest_poff;
>> +
>> +			for (i = 0; i < blocks; i++) {
>> +				if (src[i] - dst[i]) {
>> +					same = false;
>> +					goto finished;
>> +				}
>> +			}
>> +
>> +			if (rem_len) {
>> +				src_addr += src_poff + (blocks * block_size);
>> +				dest_addr += dest_poff + (blocks * block_size);
>> +				if (memcmp(src_addr, dest_addr, rem_len))
>> +					same = false;
>> +			}
>> +		}
>> +finished:
>>  		kunmap_atomic(dest_addr);
>>  		kunmap_atomic(src_addr);
>>  unlock:
>> --
>> 2.25.1
>>
> 


