Search Postgresql Archives

Re: cyclical redundancy checksum algorithm(s)?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



You sure that's actually what he said?  A change in CRC proves the data
changed, but lack of a change does not prove it didn't.

"To quickly determine if rows have changed, we rely on a cyclic
redundancy checksum (CRC) algorithm.   If the CRC is identical for the
>
"summary" functions, such as an MD5 hash.  I wouldn't trust it at all
with a 32-bit CRC, and not much with a 64-bit CRC.  Too much risk of
collision.

Small example of collisions for crc32:
0x38ee5531
        Hundemarke
        92294
0x59471e4f
        raciner
        tranchefiler
0x947bb6c0
        Betriebsteile
        4245


I had make a lot of work when choosing hash function for tsearch2. Also, I had find that popular hash algorithms produce more collision for non-ascii languages... CRC32 is more "smooth". On dictionary with 332296 unique words CRC32 produces 11 collisions, perl's hash function - 35, pgsql's hash_any - 12.

--
Teodor Sigaev                                   E-mail: teodor@xxxxxxxxx
                                                   WWW: http://www.sigaev.ru/


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux