Re: [PATCH v2 4/4] gitdiffcore doc: mention new preliminary step for rename detection

On 2/9/2021 6:32 AM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@xxxxxxxxx>
> 
> The last few patches have introduced a new preliminary step when rename
> detection is on but both break detection and copy detection are off.
> Document this new step.  While we're at it, add a testcase that checks
> the new behavior as well.

Thanks for adding this documentation and test.

> +Note that when rename detection is on but both copy and break
> +detection are off, rename detection adds a preliminary step that first
> +checks files with the same basename.  If files with the same basename

I find myself wanting a definition of 'basename' here, but perhaps I'm
just being pedantic. A quick search clarifies this as a standard term [1]
of which I was just ignorant.

[1] https://man7.org/linux/man-pages/man3/basename.3.html
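For anyone else who had to look it up: the basename is the final component of a path, with the leading directories stripped. A quick illustration using the standard `basename` utility (the paths are the ones from the documentation text quoted below):

```shell
# Print the last path component of each example path.
basename docs/extensions.txt          # extensions.txt
basename docs/config/extensions.txt   # extensions.txt
basename src/extension-checks.c       # extension-checks.c
```

The first two paths share a basename, so they are the pair the preliminary step would look at first.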

> +are sufficiently similar, it will mark them as renames and exclude
> +them from the later quadratic step (the one that pairwise compares all
> +unmatched files to find the "best" matches, determined by the highest
> +content similarity).  So, for example, if docs/extensions.txt and
> +docs/config/extensions.txt have similar content, then they will be
> +marked as a rename even if it turns out that docs/extensions.txt was
> +more similar to src/extension-checks.c.  At most, one comparison is
> +done per file in this preliminary pass; so if there are several
> +extensions.txt files throughout the directory hierarchy that were
> +added and deleted, this preliminary step will be skipped for those
> +files.
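To check my reading of the "at most one comparison per file" rule, here is a rough sketch of which pairs would be candidates in the preliminary pass. It is not the code in diffcore-rename.c: `unique_basename_pairs` is a hypothetical helper, the inputs are assumed to be plain lists of deleted and added paths (one per line), and the content-similarity check that decides whether a candidate is really marked as a rename is left out entirely.

```shell
#!/bin/sh
# Sketch only: list the (deleted, added) pairs that a basename-first pass
# could consider. unique_basename_pairs is a hypothetical helper, not git code.
# $1: file with deleted paths, one per line; $2: file with added paths.
unique_basename_pairs () {
	awk '
		{
			# The basename is the last "/"-separated component.
			n = split($0, parts, "/")
			base = parts[n]
			if (FILENAME == ARGV[1]) {
				del_count[base]++
				del_path[base] = $0
			} else {
				add_count[base]++
				add_path[base] = $0
			}
		}
		END {
			# Keep a basename only when it is unique on both sides, so
			# each file takes part in at most one comparison.
			for (base in del_count)
				if (del_count[base] == 1 && add_count[base] == 1)
					print del_path[base] "\t" add_path[base]
		}
	' "$1" "$2" | sort
}
```

With docs/extensions.txt and old/README deleted, and docs/config/extensions.txt, src/extension-checks.c, a/README and b/README added, this prints only the extensions.txt pair; README is skipped because it appears twice on the added side.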

> +test_expect_success 'basename similarity vs best similarity' '
> +	mkdir subdir &&
> +	test_write_lines line1 line2 line3 line4 line5 \
> +			 line6 line7 line8 line9 line10 >subdir/file.txt &&
> +	git add subdir/file.txt &&
> +	git commit -m "base txt" &&
> +
> +	git rm subdir/file.txt &&
> +	test_write_lines line1 line2 line3 line4 line5 \
> +			  line6 line7 line8 >file.txt &&
> +	test_write_lines line1 line2 line3 line4 line5 \
> +			  line6 line7 line8 line9 >file.md &&
> +	git add file.txt file.md &&
> +	git commit -a -m "rename" &&
> +	git diff-tree -r -M --name-status HEAD^ HEAD >actual &&
> +	# subdir/file.txt is 89% similar to file.md, 78% similar to file.txt,
> +	# but since same basenames are checked first...
> +	cat >expected <<-\EOF &&
> +	A	file.md
> +	R078	subdir/file.txt	file.txt
> +	EOF
> +	test_cmp expected actual
> +'
> +

I appreciate the additional comments in this test to make it clear
what you are testing. A minor nit is that the test could have been
added at the start of the series to document the _old_ behavior.
The 'expected' file would have this content:

+	cat >expected <<-\EOF &&
+	R089	subdir/file.txt	file.md
+	A	file.txt
+	EOF

Then the patch that introduces the behavior change would also update
this test's expected output, making the change visible in the diff.

Thanks,
-Stolee



