#1) The seen field is wrapping; char is too small.

#2)
static void
rev_tag_search (rev_ref **tags, int ntag, rev_ref *tag, rev_list *rl)
{
    rev_commit **commits = calloc (ntag, sizeof (rev_commit *));
    int n;

    for (n = 0; n < ntag; n++)
        commits[n] = tags[n]->commit;
    ntag = rev_commit_date_sort (commits, ntag);
    tag->parent = rev_branch_of_commit (rl, commits[0]);
    if (tag->parent)
        tag->commit = rev_commit_locate (tag->parent, commits[0]);
    if (!tag->commit) {
        tag->commit = rev_commit_build (commits, ntag);
        /* everything from here is leaking */
    }
    free (commits);
}

#3) cvs_find_symbol is very hot and should probably use a hash. Mozilla
has thousands of symbols.

#4) rcs2git is n-squared: it parses the file over and over to get the
revs. The n-squared behavior really hurts when a ,v file is 45MB. A
single-pass algorithm would work wonders, or at least cache the offsets
to the revs as they are found. Can the revs be written straight to a
pack file and then connected up with a tree later?

#5) This small group fails: parsecvs sends a null to git, and git dies.
/home/mozcvs/mozilla/Makefile.in,v
/home/mozcvs/mozilla/.cvsignore,v
/home/mozcvs/mozilla/LEGAL,v
/home/mozcvs/mozilla/LICENSE,v
/home/mozcvs/mozilla/README.txt,v
/home/mozcvs/mozilla/aclocal.m4,v
/home/mozcvs/mozilla/camino.mk,v

#6) Comparing versions is very hot. Could versions be encoded into a
long or long long for more efficient comparisons? For example, a packed
bit field unioned with a long, with checks during initial parsing to
make sure the fields don't overflow.

I'm working on some global analysis to try and track down the missing
branch tags, but it is very slow going. It would be better to speed up
the basic process first.

--
Jon Smirl
jonsmirl@xxxxxxxxx
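On #3, a rough sketch of what a hashed lookup could look like. It assumes symbols are looked up by name, which may not match what cvs_find_symbol actually does. Every name here (sym_table, sym_insert, sym_find, SYM_BUCKETS) is made up for illustration, and the void * value is only a stand-in for whatever a symbol really maps to.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SYM_BUCKETS 4096    /* hypothetical size; must be a power of two */

typedef struct sym_entry {
    struct sym_entry *next;     /* chain of entries sharing a bucket */
    const char       *name;     /* not copied; caller keeps it alive */
    void             *value;    /* stand-in for the symbol's payload */
} sym_entry;

typedef struct {
    sym_entry *bucket[SYM_BUCKETS];     /* must start out zeroed */
} sym_table;

/* 32-bit FNV-1a over the name, reduced to a bucket index. */
static unsigned
sym_hash (const char *s)
{
    uint32_t h = 2166136261u;

    while (*s) {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h & (SYM_BUCKETS - 1);
}

/* Add a symbol at the head of its bucket. Returns 0, or -1 if out of memory. */
static int
sym_insert (sym_table *t, const char *name, void *value)
{
    unsigned   i = sym_hash (name);
    sym_entry *e = malloc (sizeof *e);

    if (!e)
        return -1;
    e->name = name;
    e->value = value;
    e->next = t->bucket[i];
    t->bucket[i] = e;
    return 0;
}

/* Look a symbol up by name. Returns its value, or NULL if absent. */
static void *
sym_find (const sym_table *t, const char *name)
{
    const sym_entry *e;

    for (e = t->bucket[sym_hash (name)]; e; e = e->next)
        if (strcmp (e->name, name) == 0)
            return e->value;
    return NULL;
}
```

With a few thousand symbols and 4096 buckets the chains stay short, so each lookup costs one hash plus a handful of strcmp calls instead of a walk over every symbol.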
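On #6, a minimal sketch of the packed encoding, under assumptions that would need checking against real repositories: at most four dotted components, each no larger than 65534. It uses shifts rather than a bit field unioned with a long, since bit field layout is compiler dependent. Each field stores the component plus one, so an absent component (0) sorts below a real component of 0, and "1.2" stays distinct from "1.2.0". The names version_pack and version_compare are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define VER_FIELDS 4        /* assumed maximum depth */
#define VER_BITS   16       /* bits per component */
#define VER_MAX    0xFFFEu  /* largest component that fits after the +1 bias */

/*
 * Pack a dotted version such as "1.2.0.4" into a single 64-bit key,
 * most significant component first. Returns 0 on success, or -1 if
 * the string is malformed, too deep, or a component would overflow.
 */
static int
version_pack (const char *s, uint64_t *out)
{
    uint64_t key = 0;
    int      n = 0;

    for (;;) {
        char          *end;
        unsigned long  v;

        if (*s < '0' || *s > '9' || n == VER_FIELDS)
            return -1;
        v = strtoul (s, &end, 10);
        if (v > VER_MAX)
            return -1;
        key |= (uint64_t) (v + 1) << (VER_BITS * (VER_FIELDS - 1 - n));
        n++;
        if (*end == '\0')
            break;
        if (*end != '.')
            return -1;
        s = end + 1;
    }
    *out = key;
    return 0;
}

/* Ordering of packed keys matches component-wise ordering of the versions. */
static int
version_compare (uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}
```

The overflow check happens once, at parse time, so the hot path is a plain integer compare. Anything deeper than four components is rejected here and would need a slower fallback path.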