Hi Elijah,

On 11/11/2023 05:46, Elijah Newren wrote:
> * filename similarity is extraordinarily expensive compared to exact
> renames, and if not carefully handled, can sometimes rival the cost of
> file content similarity computations given our spanhash
> representations.

I've not heard of spanhash representation before. Any references or
further reading?

> Exact renames are tasked with finding renames even
> if they are known to not be relevant, simply because exact renames can
> do so very quickly. If we change that, we throw a monkey wrench in
> our performance handling elsewhere and have to rethink a number of
> other things.

-- 
Philip