Derrick Stolee <stolee@xxxxxxxxx> writes: > THIS IS A BREAKING CHANGE. Commit-graph files with changed-path > Bloom filters computed by a previous commit will not be compatible > with the filters computed in this commit, nor will we get correct > results when testing across these incompatible versions. Normally, > this would be a completely unacceptable change, but the filters > have not been released and hence are still possible to update > before release. Sure, it hasn't even hit 'next' yet. But I think we are both sort-of leaning towards concluding that it does not help all that much. So I think it is OK. > TODO: If we decide to move in this direction, then the following > steps should be done (and some of them should be done anyway): Even if we decide not to do this "downcase before hashing" thing, we should document how exactly we compute, I think. And if we decide do change our mind later, it is not the end of the world. We should be able to use a different chunk type to store the filters computed over downcased paths. > * We need to document the Bloom filter format to specify exactly > how we compute the filter data. The details should be careful > enough that someone can reproduce the exact file format without > looking at the C code. > > * That document would include the tolower() transformation that is > being done here. As the tree-diff comparison done internally while running "git blame" does not take an end-user specified pathspec in any meaningful way, this does not matter in practice, but there is another possible complication we would want to consider when we extend the support to "git log" and friends---negative pathspecs (e.g. "git log ':(exclude)COPYING'"). A commit that cannot possibly have touched the COPYING file would be eligible for output without actually running tree-diff between it and its parent (as long as the trees of the two commits are different, we know it must be shown). Thanks.