On Fri, Apr 1, 2016 at 7:23 PM, Allen Samuels <Allen.Samuels@xxxxxxxxxxx> wrote:
> Talk about mental failures. The first statement is correct. It's about
> the ratio of checksum to data bits. After that please ignore. If you
> double the data you need to double the checksum bits to maintain the BER.

Forgive me if I'm wrong here (I haven't done anything with checksumming since I graduated college), but good checksumming is about probabilities, and people suck at evaluating probability: I'm really not sure any of the explanations given in this thread are right.

Bit errors aren't random, and in general it takes a lot more than one bit flip to collide a checksum, so I don't think the relationship between block size and the chance of an undetected error is linear. Finding collisions with cryptographic hashes is hard! Granted, a CRC is a lot simpler than SHA-1 or whatever, but we also aren't facing adversaries with it, just random corruption.

So yes, as your data block grows, the number of possible bit patterns that map to the same CRC has to grow too -- but that doesn't mean your odds of actually *getting* one of those patterns by mistake grow linearly with block size.

I spent a brief time trying to read up on Hamming distances and "maximum distance separable" codes to try to remember/understand this, and it's just making my head hurt, so hopefully somebody with the right math background can chime in.

-Greg
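
For what it's worth, here is a rough way to test that intuition empirically: corrupt copies of random blocks of a few sizes and count how often a deliberately weak checksum fails to notice. A full 32-bit CRC collides far too rarely to observe in a quick run, so this sketch keeps only the low 8 bits of zlib.crc32; the block sizes, trial count, and bit-flip corruption model are arbitrary choices for illustration, not anything Ceph actually does.

# Rough sanity check of the "undetected errors don't scale linearly with
# block size" intuition: flip random bits in copies of a random block and
# count how often a weak checksum misses the change.  Only the low 8 bits
# of zlib.crc32 are kept so that collisions are frequent enough to observe.

import os
import random
import zlib

CHECK_BITS = 8                       # width of the (deliberately weak) check
MASK = (1 << CHECK_BITS) - 1
TRIALS = 50_000
FLIPS = 8                            # random bit flips per corrupted copy

def checksum(data: bytes) -> int:
    return zlib.crc32(data) & MASK

def corrupt(data: bytes) -> bytes:
    """Return a copy of `data` with FLIPS randomly chosen bits flipped."""
    buf = bytearray(data)
    for _ in range(FLIPS):
        i = random.randrange(len(buf))
        buf[i] ^= 1 << random.randrange(8)
    return bytes(buf)

for block_size in (64, 1024, 16384):
    block = os.urandom(block_size)
    good = checksum(block)
    undetected = 0
    for _ in range(TRIALS):
        bad = corrupt(block)
        if bad != block and checksum(bad) == good:
            undetected += 1
    print(f"{block_size:6d}-byte blocks: {undetected / TRIALS:.5f} undetected "
          f"(~{1 / (1 << CHECK_BITS):.5f} expected for an "
          f"{CHECK_BITS}-bit check)")

If the intuition is right, the undetected rate should come out near 1/256 for every block size here, i.e. roughly 2^-k for a k-bit check, rather than growing with the size of the block.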