remove_duplicates() in builtin/fetch-pack.c is O(N^2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I just noticed that the remove_duplicates() function in builtin/fetch-pack.c is O(N^2) in the number of heads. Empirically, this function takes on the order of 25 seconds to process 100k references.

I know that 100k heads is kindof absurd. Perhaps handling this many heads is unrealistic for other reasons. But I vaguely recall numbers like this being mentioned on the mailing list.

It would be pretty trivial to reduce the work to O(N) by using a hash set to keep track of the references that have already been seen.

I don't plan to work on this, but I thought I would point it out in case it is causing somebody pain.

Michael

--
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]