On 2/9/2021 6:32 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@xxxxxxxxx>
>
> The last few patches have introduced a new preliminary step when rename
> detection is on but both break detection and copy detection are off.
> Document this new step. While we're at it, add a testcase that checks
> the new behavior as well.

Thanks for adding this documentation and test.

> +Note that when rename detection is on but both copy and break
> +detection are off, rename detection adds a preliminary step that first
> +checks files with the same basename. If files with the same basename

I find myself wanting a definition of 'basename' here, but perhaps
I'm just being pedantic. A quick search clarifies this as a standard
term [1] of which I was just ignorant.

[1] https://man7.org/linux/man-pages/man3/basename.3.html

> +are sufficiently similar, it will mark them as renames and exclude
> +them from the later quadratic step (the one that pairwise compares all
> +unmatched files to find the "best" matches, determined by the highest
> +content similarity). So, for example, if docs/extensions.txt and
> +docs/config/extensions.txt have similar content, then they will be
> +marked as a rename even if it turns out that docs/extensions.txt was
> +more similar to src/extension-checks.c. At most, one comparison is
> +done per file in this preliminary pass; so if there are several
> +extensions.txt files throughout the directory hierarchy that were
> +added and deleted, this preliminary step will be skipped for those
> +files.
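
For readers who want the quoted paragraph in concrete form, here is a
minimal Python sketch of the preliminary step as the documentation text
describes it. It is not the actual diffcore-rename code. The names
`basename_prepass`, `unique_by_basename`, and `min_score` are
hypothetical, `difflib.SequenceMatcher` is only a stand-in for git's
similarity measure, and skipping a basename that occurs more than once
on either side is my reading of the "several extensions.txt files"
sentence.

```python
import posixpath
from collections import defaultdict
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in score in [0, 1]; NOT git's actual similarity algorithm."""
    return SequenceMatcher(None, a, b).ratio()


def unique_by_basename(paths):
    """Map basename -> path, keeping only basenames that occur exactly once."""
    groups = defaultdict(list)
    for path in paths:
        groups[posixpath.basename(path)].append(path)
    return {base: ps[0] for base, ps in groups.items() if len(ps) == 1}


def basename_prepass(deleted, added, min_score=0.5):
    """Pair deleted and added files that share a unique basename.

    `deleted` and `added` map path -> content. `min_score` is an assumed
    threshold. Returns (renames, left_deleted, left_added); the two
    "left" lists are what a later pairwise pass would still compare.
    """
    del_unique = unique_by_basename(deleted)
    add_unique = unique_by_basename(added)
    renames = {}
    for base, src in del_unique.items():
        dst = add_unique.get(base)
        if dst is None:
            continue  # no single same-named candidate on the added side
        # At most one comparison per file in this pass.
        if similarity(deleted[src], added[dst]) >= min_score:
            renames[src] = dst
    matched_dst = set(renames.values())
    left_deleted = [p for p in deleted if p not in renames]
    left_added = [p for p in added if p not in matched_dst]
    return renames, left_deleted, left_added


def make_lines(n):
    """Content shaped like the test's files: line1 .. lineN, one per line."""
    return "".join(f"line{i}\n" for i in range(1, n + 1))


deleted = {"subdir/file.txt": make_lines(10)}
added = {"file.txt": make_lines(8), "file.md": make_lines(9)}
print(basename_prepass(deleted, added))
```

With these inputs, subdir/file.txt is paired with file.txt because the
basenames match and the content is similar enough, and file.md is left
over for the later pass even though it shares more content with the
deleted file.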

> +test_expect_success 'basename similarity vs best similarity' '
> +	mkdir subdir &&
> +	test_write_lines line1 line2 line3 line4 line5 \
> +			 line6 line7 line8 line9 line10 >subdir/file.txt &&
> +	git add subdir/file.txt &&
> +	git commit -m "base txt" &&
> +
> +	git rm subdir/file.txt &&
> +	test_write_lines line1 line2 line3 line4 line5 \
> +			 line6 line7 line8 >file.txt &&
> +	test_write_lines line1 line2 line3 line4 line5 \
> +			 line6 line7 line8 line9 >file.md &&
> +	git add file.txt file.md &&
> +	git commit -a -m "rename" &&
> +	git diff-tree -r -M --name-status HEAD^ HEAD >actual &&
> +	# subdir/file.txt is 89% similar to file.md, 78% similar to file.txt,
> +	# but since same basenames are checked first...
> +	cat >expected <<-\EOF &&
> +	A	file.md
> +	R078	subdir/file.txt	file.txt
> +	EOF
> +	test_cmp expected actual
> +'
> +

I appreciate the additional comments in this test to make it clear
what you are testing.

A minor nit is that the test could have been added at the start of
the series to document the _old_ behavior. The 'expected' file would
have this content:

+	cat >expected <<-\EOF &&
+	A	file.txt
+	R078	subdir/file.txt	file.md
+	EOF

Then, this test case would change the expected output in the same
patch that introduces the behavior change.

Thanks,
-Stolee