Re: [PATCH v2] clone: report duplicate entries on case-insensitive filesystems

Jeff King <peff@xxxxxxxx> · Thu, 9 Aug 2018 17:59:13 -0400

On Thu, Aug 09, 2018 at 02:53:42PM -0700, Elijah Newren wrote:

> On Thu, Aug 9, 2018 at 2:44 PM Jeff King <peff@xxxxxxxx> wrote:
> > > The error message isn't quite as good, but does the user really need
> > > all the names of the file?  If so, we gave them enough information to
> > > figure it out, and this is a really unusual case anyway, right?
> > > Besides, now we're back to linear performance....
> >
> > Well, it's still quadratic when they run O(n) iterations of "git
> > ls-files -s | grep $colliding_oid". You've just pushed the second linear
> > search onto the user. ;)
> 
> Wouldn't that be their own fault for not running
>   git ls-files -s | grep -e $colliding_oid_1 ... -e $colliding_oid_n | sort -k 2
> ?   ;-)

Man, this thread is the gift that keeps on giving. :)

That's still quadratic, isn't it? You've just hidden the second
dimension in the single grep call.

Now since these are all going to be constant strings, in theory an
intelligent grep could stick them all in a search trie, and match each
line with complexity k, the length of the matched strings. And since
k=40, that's technically still linear overall.
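
To make the trie idea concrete, here is a rough Python sketch. It is not what any real grep does internally. The helper names (build_trie, in_trie, matching_lines) are made up for illustration, and it assumes the oid is the second whitespace-separated field of each "git ls-files -s" line rather than searching the whole line as grep would.

```python
def build_trie(oids):
    """Insert each constant string into a nested-dict trie."""
    root = {}
    for oid in oids:
        node = root
        for ch in oid:
            node = node.setdefault(ch, {})
        node[None] = True  # end-of-pattern marker; None never collides with a character
    return root


def in_trie(root, s):
    """Walk at most len(s) edges, regardless of how many patterns were inserted."""
    node = root
    for ch in s:
        node = node.get(ch)
        if node is None:
            return False
    return None in node


def matching_lines(lines, oids):
    """Keep the ls-files-style lines whose oid field is one of the given oids."""
    root = build_trie(oids)
    out = []
    for line in lines:
        fields = line.split()
        # Assumed layout: mode, oid, stage, then the path.
        if len(fields) >= 2 and in_trie(root, fields[1]):
            out.append(line)
    return out
```

Each lookup costs at most k steps no matter how many colliding oids went into the trie, so n lines cost O(n*k) in total, which is where the "technically still linear" comes from.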

-Peff