Re: Git Rename Detection Bug

Philip Oakley <philipoakley@iee.email> · Wed, 15 Nov 2023 16:51:53 +0000

Hi Elijah,

On 11/11/2023 05:46, Elijah Newren wrote:
> * filename similarity is extraordinarily expensive compared to exact
> renames, and if not carefully handled, can sometimes rival the cost of
> file content similarity computations given our spanhash
> representations.

I've not heard of the spanhash representation before. Are there any
references or further reading?

>    Exact renames are tasked with finding renames even
> if they are known to not be relevant, simply because exact renames can
> do so very quickly.  If we change that, we throw a monkey wrench in
> our performance handling elsewhere and have to rethink a number of
> other things.
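
For anyone following along, here is a rough sketch of why the exact pass
is so cheap, as I understand it: identical content has an identical
object ID, so matching is just a hash-table lookup with no content
comparison. The function names and the dict-based inputs below are made
up for illustration; this is not Git's actual code, and Git already has
the IDs from the trees rather than hashing file contents as done here.

```python
import hashlib
from collections import defaultdict


def blob_id(data: bytes) -> str:
    """Git-style SHA-1 blob ID: hash of 'blob <size>\\0' followed by the content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def exact_renames(deleted: dict[str, bytes], added: dict[str, bytes]) -> dict[str, str]:
    """Pair deleted and added paths whose contents are byte-identical.

    Hypothetical helper: both arguments map path -> file content.
    Returns a mapping of old path -> new path.
    """
    # Index the deleted files by object ID.
    by_id = defaultdict(list)
    for path, data in deleted.items():
        by_id[blob_id(data)].append(path)

    # Each added file costs one lookup; no pairwise comparison is needed.
    renames = {}
    for path, data in added.items():
        candidates = by_id.get(blob_id(data))
        if candidates:
            renames[candidates.pop(0)] = path
    return renames
```

The cost grows with the number of files rather than with the number of
(deleted, added) pairs, which is what the inexact similarity pass has to
consider.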

--
Philip