On Tue, Mar 9, 2021 at 2:21 PM Derrick Stolee <stolee@xxxxxxxxx> wrote: > > On 3/8/2021 7:09 PM, Elijah Newren via GitGitGadget wrote:> @@ -1127,9 +1137,10 @@ void diffcore_rename_extended(struct diff_options *options, > > /* > > * Cull sources: > > * - remove ones corresponding to exact renames > > + * - remove ones not found in relevant_sources > > */ > > trace2_region_enter("diff", "cull after exact", options->repo); > > - remove_unneeded_paths_from_src(want_copies); > > + remove_unneeded_paths_from_src(want_copies, relevant_sources); > > trace2_region_leave("diff", "cull after exact", options->repo); > > In this case, we are checking for copies. Perhaps there is a reason > why we want to include relevant_sources in that case, so I'll look > for it later. There is no current caller making use of this. I could have just marked the two options as incompatible and passed NULL here for relevant sources, but...I didn't see a compelling reason to. If some caller wanted copy detection and were only interested in copies from a subset of sources, they could take advantage of relevant_sources in the same way to limit to those paths that are relevant. I guess it comes down to whether you view this as designing the code for future theoretical cases, or explicitly taking steps to limit future possibilities. I don't have a strong opinion here, but tried to enable other future uses since diffcore-rename.c is used by many other callers too. > > } else { > > /* Determine minimum score to match basenames */ > > @@ -1148,7 +1159,7 @@ void diffcore_rename_extended(struct diff_options *options, > > * - remove ones involved in renames (found via exact match) > > */ > > trace2_region_enter("diff", "cull after exact", options->repo); > > - remove_unneeded_paths_from_src(want_copies); > > + remove_unneeded_paths_from_src(want_copies, NULL); > > trace2_region_leave("diff", "cull after exact", options->repo); > > > > /* Preparation for basename-driven matching. */ > > @@ -1167,9 +1178,10 @@ void diffcore_rename_extended(struct diff_options *options, > > /* > > * Cull sources, again: > > * - remove ones involved in renames (found via basenames) > > + * - remove ones not found in relevant_sources > > */ > > trace2_region_enter("diff", "cull basename", options->repo); > > - remove_unneeded_paths_from_src(want_copies); > > + remove_unneeded_paths_from_src(want_copies, relevant_sources); > > trace2_region_leave("diff", "cull basename", options->repo); > > This seems backwards from your cover letter. You are using the exact renames > _and_ basename matches to remove the unneeded paths. Why are we not stripping > out the relevant_sources in the call further up, before we call > find_basename_matches()? Yeah, good flag. I should add a comment to the commit message about how these are still needed in initialize_dir_rename_info() for basename-guided directory rename detection...and will also play a role in some of the special cases where renames are needed for more than three-way content merges. And add a comment about how by the end of the series, relevant_sources will be passed to find_basename_matches() so that it can skip over those paths.