Re: Btrfs more than twice as fast compared to ext4




On 16/03/10 00:48, Shridhar Daithankar wrote:
[...]
But as far as file system performance goes, the overhead should be identical
for both the runs, no?

I'm not too sure about that. I'm guessing there is less seeking going on with Btrfs. Some file systems (reiserfs and reiser4, IIRC) are very good with many small files, better than the ext* family; this may be another case of that.

Besides, I need to run the comparison (or rather, verification of file contents)
many times over during the application life-cycle and I cannot afford to bring
in another copy from disk. The working set is expected to be 30-40GB at a
time, 3GB is just test setup.

With md5sum, I can store it in database and verify it on one copy only.

Fair enough.

And finally, it is terrible on timings. Running md5sum is a lot faster, about 3
times in the best case.
 [...]
wow, that's slow!

So when the source file system is btrfs, it is still a couple of times faster at
least.
I still think you could achieve better times by not calling the external command that many times. Since you're already going to store the checksums in a database, I'd just write a proper program in Python or something.
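
To make that suggestion concrete, here is a minimal sketch of what such a program might look like. It is only an illustration: the function names md5_of_file and md5_of_tree and the 1 MiB chunk size are my own choices, not anything from the original setup, and the database step is left out because I don't know which database is in use.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read chunk by chunk so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_tree(root):
    """Map every regular file under root to its hex MD5 digest."""
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Roughly "find -type f": skip symlinks and non-regular files.
            if os.path.isfile(path) and not os.path.islink(path):
                checksums[path] = md5_of_file(path)
    return checksums
```

Calling md5_of_tree("/usr/bin") returns a dict of path to digest that can be written to the database in one go. Everything runs in a single process, so there is no per-file fork/exec, and paths are handled as whole strings, so spaces in filenames are not a problem.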

Or even just a shell script, but you might want to refrain from for i in `find ...`: it's the slowest, and it relies on your filenames not having spaces in them.

[[ky] ~]# }} time find /usr/bin -type f -print0 | xargs -0 md5sum > /tmp/1
real	0m3.633s

[[ky] ~]# }} time find /usr/bin -type f -exec md5sum "{}" \; > /tmp/2
real	0m10.196s
[[ky] ~]# }} time for i in `find /usr/bin -type f`;do md5sum "$i";done > /tmp/3
real	0m11.245s

This last version missed a file because it has spaces in its name, and as a result file 3 was inconsistent with files 1 and 2:

[[ky] ~]# }} diff /tmp/{1,2}
[[ky] ~]# }} diff /tmp/{3,2}
3054a3055
> 0c5d8f10aa0731671a00961f059dc46e /usr/bin/New SMB and DCERPC features in Impacket.pdf

That was a test against just 4008 files, so you can imagine the time savings with 50000+ files.

