On Thu, Mar 10, 2011 at 2:09 AM, Sim Zacks <sim@xxxxxxxxxxxxxx> wrote:
>
>> > The question is, if it screws up and says that an image already exists
>> > and then returns a different image when querying for it, how bad would
>> > that be.
>>
>> It'll never happen:
>>
>> http://stackoverflow.com/questions/862346/how-do-i-assess-the-hash-collision-probability
>>
>> Sure you CAN go out of your way to generate collisions, but I'd bet
>> money you never see one from your setup.
>>
>> The probability is extremely slim. And if that's too much of a chance,
>> use sha2, it's mind numbingly slim.
>>
>> If you were doing cryptography it would be a problem, yes, but not
>> checking file equality.
>>
>> -Andy
>
> Never is a long time. The question that I asked is precisely: how much money
> you would bet that you'll never hit a collision. It depends on the use case.
> If you are talking about privacy issues, which can include lawsuits, loss of
> reputation and/or damages, then I wouldn't take that risk, even on sha2.
> Especially not with all the publicly available documentation explaining why
> not to do it. If you are talking about a minor inconvenience or
> professional pride because the wrong image showed up, or the right image was
> never stored, then it may be worth the risk.

Regardless of the intended use, I would bet every dollar I've ever made,
will make, could borrow, beg, steal, etc. vs. 1 of your dollars and
happily collect it when I won the bet. See here:
(http://en.wikipedia.org/wiki/Birthday_attack) and look at the table of
odds vs. population size... your statement is not in line with
mathematical reality, and from a risk standpoint there are a large number
of things to worry about before a sha2 collision, such as drive bit error
rates, spontaneous combustion, etc. AFAIK, even sha1 collisions have
never been found in the wild. The zfs deduplication system relies on hash
equality to deduplicate disk blocks (sha256 by default), and BitTorrent
uses sha1 to verify pieces.

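
For anyone who wants to redo the birthday-bound arithmetic behind that table of odds, here is a rough sketch. It assumes the hash behaves like a uniform random function and uses the standard approximation p ~ 1 - exp(-n(n-1)/(2 * 2^bits)). The function name and the one-billion-image population are illustrative assumptions, not figures from this thread.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items share a hash value.

    Uses the birthday-bound approximation 1 - exp(-pairs / 2**hash_bits),
    which assumes hash outputs are uniformly distributed.
    """
    if n_items < 2:
        return 0.0
    pairs = n_items * (n_items - 1) // 2   # number of distinct pairs
    x = pairs / 2 ** hash_bits             # expected number of colliding pairs
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-x)


if __name__ == "__main__":
    n_images = 10 ** 9  # hypothetical population size, chosen for illustration
    for name, bits in (("md5", 128), ("sha1", 160), ("sha256", 256)):
        print(f"{name:7s} {collision_probability(n_images, bits):.3e}")
```

This only covers accidental collisions between honest inputs; it says nothing about deliberately constructed ones.
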
In fact, many computing systems you rely on depend on hashes weaker
than sha2. Schneier speculates that we may see a sha1 collision soon;
see here:
http://blog.valerieaurora.org/2009/06/25/sha-1-collision-expected-within-a-year/.
A small number of accidental md5 collisions have been reported in the
wild.

merlin