On Fri, Dec 20, 2024 at 05:19:47PM +0000, Jonathan Tan via GitGitGadget wrote:
> The first change is to be more careful about paths using non-ASCII
> characters. With these characters in mind, reverse the bits in the byte
> as the least-significant bits have the highest entropy and we want to
> maximize their influence. This is done with some bit manipulation that
> swaps the two halves, then the quarters within those halves, and then
> the bits within those quarters.

Makes sense, and seems quite reasonable.

> The second change is to perform hash composition operations at every
> level of the path. This is done by storing a 'base' hash value that
> contains the hash of the parent directory. When reaching a directory
> boundary, we XOR the current level's name-hash value with a downshift of
> the previous level's hash. This perturbation intends to create low-bit
> distinctions for paths with the same final 16 bytes but distinct parent
> directory structures.

Very clever, I love this idea.

Thanks,
Taylor
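
For anyone following along, the two quoted changes can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not the code from the patch: the names `reverse_bits8` and `name_hash_sketch` are hypothetical, and the shift amounts (2 bits per character, 6 bits for the parent hash) and the 32-bit width are assumptions chosen for the example rather than details given in the quoted text.

```python
MASK32 = 0xFFFFFFFF  # assumption: the hash is a 32-bit unsigned value


def reverse_bits8(c: int) -> int:
    """Reverse the bits of one byte: swap the two halves, then the
    quarters within those halves, then the bits within those quarters."""
    c = ((c & 0xF0) >> 4) | ((c & 0x0F) << 4)
    c = ((c & 0xCC) >> 2) | ((c & 0x33) << 2)
    c = ((c & 0xAA) >> 1) | ((c & 0x55) << 1)
    return c


def name_hash_sketch(path: bytes) -> int:
    """Hypothetical sketch of a per-directory-level name hash."""
    level_hash = 0  # hash of the current path component
    base = 0        # composed hash of the parent directories
    for c in path:
        if c == ord("/"):
            # Directory boundary: XOR this level's hash with a
            # downshift of the previous levels' hash, then start over.
            base = (base >> 6) ^ level_hash  # shift of 6 is an assumption
            level_hash = 0
        else:
            # Put the reversed byte in the high bits so that its
            # originally low (high-entropy) bits carry the most weight.
            level_hash = ((level_hash >> 2) + (reverse_bits8(c) << 24)) & MASK32
    return (base >> 6) ^ level_hash


# Same file name, different parent directories.
print(hex(name_hash_sketch(b"a/file.txt")))
print(hex(name_hash_sketch(b"b/file.txt")))
```

With these assumed shifts, the parent directory's contribution lands in the lower bits of the result, so the two paths above hash to different values even though their final components are identical.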