"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > This is not meant to be cryptographic at all, but uniformly distributed > across the possible hash values. This creates a hash that appears > pseudorandom. There is no ability to consider similar file types as > being close to each other. Another consideration we had when designing the current mechanism, which is more important than "compare .c files with each other", is to handle the case where a file is moved across directory boundary without changing its name. These "hash collissions" are meant to be a part of obtaining _good_ paring of blobs that ought to be similar to each other. In other words, we wanted them to collide so that we do not have to be negatively affected by moves. I am not saying that we should not update the pack name hash; I am just saying that "consider similar file types" as if that is the most important aspect of the current hash, is misleading. Thanks.