Re: Which hashing algorithm is best to check file duplicity?

"Fastest" depends mostly on the size of the file, not the algorithm used. A 2 GB file will take a while with md5, just as it will with sha1.

Using md5 will be somewhat quicker than sha1 because it does less work per block and generates a shorter hash (128 bits versus 160), so the trade-off is up to you.

$ ls -lh file.gz

724M 2008-07-28 10:02 file.gz

$ time sha1sum file.gz
4ae7bd1e79088a3e3849e17c7be989d4a7c97450  file.gz

real    0m3.398s
user    0m3.056s
sys    0m0.336s

$ time md5sum file.gz
16cff7b95bcb5971daf1cabee6ca4edd  file.gz

real    0m2.091s
user    0m1.744s
sys    0m0.328s

$ time sha1sum file.gz
4ae7bd1e79088a3e3849e17c7be989d4a7c97450  file.gz

real    0m3.332s
user    0m2.988s
sys    0m0.344s

$ time md5sum file.gz
16cff7b95bcb5971daf1cabee6ca4edd  file.gz

real    0m2.136s
user    0m1.776s
sys    0m0.348s
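
Since the time grows with file size, a cheap size comparison can rule out most non-duplicates before any hashing is done. A minimal sketch of that idea, assuming GNU coreutils (wc, md5sum, cut) are available; same_content is a made-up helper name for illustration, not something from the benchmark above:

```shell
# Hypothetical helper: succeeds (exit 0) if the two files have the same content,
# returns 1 if they differ, and 2 if a file cannot be read.
same_content() {
    # Files of different sizes cannot be duplicates, so skip the hash entirely.
    size_a=$(wc -c < "$1") || return 2
    size_b=$(wc -c < "$2") || return 2
    [ "$size_a" -eq "$size_b" ] || return 1

    # Same size: compare the md5 digests (first field of md5sum's output).
    sum_a=$(md5sum < "$1" | cut -d ' ' -f 1)
    sum_b=$(md5sum < "$2" | cut -d ' ' -f 1)
    [ "$sum_a" = "$sum_b" ]
}
```

Called as `same_content file.gz copy.gz && echo duplicate`, it only pays the hashing cost when the sizes already match. Swapping md5sum for sha1sum is a one-word change if the longer hash is preferred.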

Aha, thanks for sharing the benchmark. I'll go with MD5().

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

