On 2/23/2021 6:44 PM, Elijah Newren via GitGitGadget wrote:
> For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
> performance work; instrument with trace2_region_* calls", 2020-10-28),
> this change improves the performance as follows:
>
>                     Before                  After
>     no-renames:      12.775 s ±  0.062 s    12.596 s ±  0.061 s
>     mega-renames:   188.754 s ±  0.284 s   130.465 s ±  0.259 s
>     just-one-mega:    5.599 s ±  0.019 s     3.958 s ±  0.010 s

Hooray!

> +    for (i = 0; i < rename_src_nr; ++i) {
> +        char *filename = rename_src[i].p->one->path;
> +        const char *base = NULL;
> +        intptr_t src_index;
>          intptr_t dst_index;
>
> +        /* Is this basename unique among remaining sources? */

This comment sent me down a confusing direction. Perhaps, we can
instead say:

    /*
     * If the basename is unique among remaining sources, then
     * src_index will equal 'i' and we can attempt to match it
     * to a unique basename in the destinations. Otherwise, use
     * directory rename heuristics, if possible.
     */

> +        base = get_basename(filename);
> +        src_index = strintmap_get(&sources, base);
> +        assert(src_index == -1 || src_index == i);
> +
> +        if (strintmap_contains(&dests, base)) {
>              struct diff_filespec *one, *two;
>              int score;
>
> +            /* Find a matching destination, if possible */
> +            dst_index = strintmap_get(&dests, base);
> +            if (src_index == -1 || dst_index == -1) {
> +                src_index = i;
> +                dst_index = idx_possible_rename(filename, info);
> +            }

It is important that 'src_index == i' from this point on, no matter
whether it was unique or not.

> +            if (dst_index == -1)
> +                continue;
> +
> +            /* Ignore this dest if already used in a rename */
> +            if (rename_dst[dst_index].is_rename)
> +                continue; /* already used previously */
> +

This seems to match all of the complicated special cases.

Thanks,
-Stolee