> -----Original Message-----
> From: Chris Dunlop [mailto:chris@xxxxxxxxxxxx]
> Sent: Friday, April 01, 2016 10:05 PM
> To: Gregory Farnum <gfarnum@xxxxxxxxxx>
> Cc: Allen Samuels <Allen.Samuels@xxxxxxxxxxx>; Sage Weil <sage@xxxxxxxxxxxx>; Igor Fedotov <ifedotov@xxxxxxxxxxxx>; ceph-devel <ceph-devel@xxxxxxxxxxxxxxx>
> Subject: Re: Adding compression/checksum support for bluestore.
>
> On Fri, Apr 01, 2016 at 07:51:07PM -0700, Gregory Farnum wrote:
> > On Fri, Apr 1, 2016 at 7:23 PM, Allen Samuels <Allen.Samuels@xxxxxxxxxxx> wrote:
> >> Talk about mental failures. The first statement is correct. It's about the ratio of checksum to data bits. After that please ignore. If you double the data you need to double the checksum bits to maintain the BER.
> >
> > Forgive me if I'm wrong here — I haven't done anything with
> > checksumming since I graduated college — but good checksumming is
> > about probabilities and people suck at evaluating probability: I'm
> > really not sure any of the explanations given in this thread are
> > right. Bit errors aren't random and in general it requires a lot more
> > than one bit flip to collide a checksum, so I don't think it's a
> > linear relationship between block size and chance of error. Finding
>
> A single bit flip can certainly result in a checksum collision, with the same
> chance as any other error, i.e. 1 in 2^number_of_checksum_bits.
>
> Just to clarify: the chance of encountering an error is linear with the block
> size. I'm contending that the chance of encountering a checksum collision as a
> result of encountering one or more errors is independent of the block size.
>
> > collisions with cryptographic hashes is hard! Granted a CRC is a lot
> > simpler than SHA1 or whatever, but we also aren't facing adversaries
> > with it, just random corruption. So yes, as your data block increases
> > then naturally the number of possible bit patterns which match the
> > same CRC has to increase — but that doesn't mean your odds of
> > actually *getting* that bit pattern by mistake increase linearly.
>
> A (good) checksum is like rolling a 2^"number of bits in the checksum"-sided
> dice across a rough table, in an ideal world where every single parameter is
> known. If you launch your dice in precisely the same way, the dice will
> behave exactly the same way, hitting the same hills and valleys in the table,
> and end up in precisely the same spot with precisely the same face on top -
> your checksum. The number of data bits is how hard you roll the dice:
> how far it goes and how many hills and valleys it hits along the way.
>
> One or more data errors (bit flips or whatever) is then equivalent to changing
> one or more of the hills or valleys: a very small difference, but encountering
> the difference puts the dice on a completely different path, thereafter
> hitting completely different hills and valleys to the original path. And which
> face is on top when your dice stops is a matter of chance (well... not really: if
> you did exactly the same again, it would end up taking precisely the same
> path and the same face would be on top when it stops).
>
> The thing is, it doesn't matter how many hills and valleys (data bits) it hits
> along the way: the chance of getting a specific face up is always the same, i.e.
> 1 / number_of_faces == 1 / 2^number_of_checksum_bits.
>
> Chris

I think you're defining BER as the odds of a read operation silently delivering wrong data, whereas I'm defining BER as the odds of an individual bit being read incorrectly.
When we have a false positive, you count "1" failure but I count "Block" number of failures. I'm not claiming that either of us is "correct"; I'm just trying to understand our positions. Do you agree with this?
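
For what it's worth, here is a rough back-of-the-envelope sketch of how the two accountings compare. The numbers are made up (the per-bit BER, the 32-bit checksum width, and the "ideal checksum misses a corruption with probability 2^-c" model are my assumptions, not anything established in this thread):

```python
# Hypothetical comparison of the two framings above. Assumptions (mine, not
# from the thread): independent random bit errors at rate BER, and an ideal
# c-bit checksum that misses any given corruption with probability 2**-c.

BER = 1e-15          # assumed per-bit error rate
CSUM_BITS = 32       # assumed checksum width in bits

def p_block_error(block_bits):
    """Probability that a read of block_bits bits delivers >= 1 flipped bit."""
    return 1.0 - (1.0 - BER) ** block_bits

def p_silent_per_read(block_bits):
    """Per-read framing: probability a read is corrupted AND the checksum
    fails to catch it (a collision)."""
    return p_block_error(block_bits) * 2.0 ** -CSUM_BITS

def silent_bits_per_read(block_bits):
    """Per-bit framing: expected number of wrongly-delivered bits per read,
    counting every bit of an undetected bad block as a failure."""
    return p_silent_per_read(block_bits) * block_bits

for kib in (4, 64, 1024):
    n = kib * 1024 * 8
    print(f"{kib:5d} KiB: P(error)={p_block_error(n):.3e}  "
          f"P(silent read)={p_silent_per_read(n):.3e}  "
          f"E[silent bits]={silent_bits_per_read(n):.3e}")
```

Under these assumptions the conditional collision probability stays at 2^-32 no matter the block size, the per-read silent-corruption probability grows linearly with block size, and the "wrong bits delivered" count grows with the square of the block size, which is essentially the difference between the two definitions being discussed.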