Yann Dirson wrote:
> " try unchanged files as candidate for copy detection.\n" \
> +" --factorize_renames\n" \
> +" factorize renames of all files of a directory.\n" \
Use dashes to separate words in arguments, please.
> +#include <libgen.h>
> +
> +#ifdef basename
> +# undef basename
> +#endif
We might as well use the GNU version of basename() at least. Even if
you don't use it, I'd rather not see this bite some unwary programmer
coming along after you. Worst case scenario, sha1's won't add up if
POSIX basename alters its argument, making for one fiendishly tricky
bug to find.
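To illustrate the concern, here is a minimal sketch of a basename that never writes to its argument, which is the property that matters here. The name path_basename is made up for this example and is not part of the patch or of git; the trailing-slash behaviour (returning an empty string) is modelled on the GNU version.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: return a pointer to the
 * last component inside `path` without modifying the string. A path
 * ending in '/' yields the empty string, as GNU basename() does.
 */
static const char *path_basename(const char *path)
{
	const char *slash;

	assert(path != NULL);
	slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}
```

Because the function only reads the string, the caller's path is still intact when it is hashed or compared afterwards.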
> +/*
> + * FIXME: we could optimize the 100%-rename case by preventing
> + * recursion to unfold what we know we would refold here.
> + * FIXME: do we want to replace linked list with sorted array ?
Either that or a hash. Most of the time seems to be spent in lookups.
With a hash you get quick lookups and reasonably quick inserts. With
a sorted array you get lower memory footprint than anything else and
can bisect your way to the right entry, which performs reasonably
close to skiplists. The sort overhead might be a deterrent factor,
but insofar as I understand it trees are always sorted in git anyway,
so perhaps that'd be the best solution.
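A rough sketch of the sorted-array variant, assuming entries are keyed by directory path. struct dir_entry, its fields, and the function names are invented for this example and will not match the structures in the patch; the lookup simply uses the standard bsearch().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; the patch's real structure will differ. */
struct dir_entry {
	const char *path;
	int nr_files;
};

/* Order entries by path so the array can be bisected. */
static int dir_entry_cmp(const void *a, const void *b)
{
	const struct dir_entry *x = a, *y = b;

	return strcmp(x->path, y->path);
}

/* Sort once, after all entries have been collected. */
static void sort_entries(struct dir_entry *entries, size_t nr)
{
	qsort(entries, nr, sizeof(*entries), dir_entry_cmp);
}

/* Bisect the sorted array; returns NULL if `path` is not present. */
static struct dir_entry *find_entry(struct dir_entry *entries, size_t nr,
				    const char *path)
{
	struct dir_entry key;

	assert(path != NULL);
	key.path = path;
	key.nr_files = 0;
	return bsearch(&key, entries, nr, sizeof(*entries), dir_entry_cmp);
}
```

If the entries already arrive in sorted order, as tree entries may, the qsort() call could be dropped and only the bisection kept.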
Apart from that, I'd need to apply the patch to review it properly,
and I'm far too tired for that now.
I like the direction this is going though, so thanks a lot for doing
it :)
--
Andreas Ericsson andreas.ericsson@xxxxxx
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231