Re: [PATCH 1/1] contrib: add coverage-diff script

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 12 Sep 2018 15:13:30 -0700

"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

>  contrib/coverage-diff.sh | 70 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
>  create mode 100755 contrib/coverage-diff.sh

I fully appreciate the motivation.  But it is a bit sad that this
begins with "#!/bin/bash" but it seems that the script is full of
bash-isms.  I haven't gone through the script to see if these are
inevitable or gratuitous yet, but I'd assume it made it easier for
you to write it to step outside the pure POSIX shell?

> +V1=$1
> +V2=$2
> +
> +diff-lines() {

Being able to use '-' in identifier is probably a bash-ism that you
did not have to use.

> +    local path=
> +    local line=
> +    while read; do

Being able to omit variable to be read into and can use the implicit
variable $REPLY also is.

> +	esc=$'\033'
> +	if [[ $REPLY =~ ---\ (a/)?.* ]]; then
> +	    continue
> +	elif [[ $REPLY =~ \+\+\+\ (b/)?([^[:blank:]$esc]+).* ]]; then
> +	    path=${BASH_REMATCH[2]}

OK, it probably is easier to write in bash than using expr if you
want to do regexp.  Where do these escape code come from in "git
diff" output, by the way?

> +	elif [[ $REPLY =~ @@\ -[0-9]+(,[0-9]+)?\ \+([0-9]+)(,[0-9]+)?\ @@.* ]]; then
> +	    line=${BASH_REMATCH[2]}
> +	elif [[ $REPLY =~ ^($esc\[[0-9;]+m)*([\ +-]) ]]; then
> +	    echo "$path:$line:$REPLY"
> +	    if [[ ${BASH_REMATCH[2]} != - ]]; then
> +		((line++))
> +	    fi
> +	fi
> +    done
> +}
> +
> +git diff --raw $V1 $V2 | grep \.c$ | awk 'NF>1{print $NF}' >files.txt

Hmph, not 

	git diff --name-only "$V1" "$V2" -- "*.c"

Do we (or do we not) want "--no-renames"?

> +for file in $(cat files.txt)
> +do
> +	hash_file=${file//\//\#}
> +
> +	git diff $V1 $V2 -- $file \
> +		| diff-lines \
> +		| grep ":+" \
> +		>"diff_file.txt"

Style:

	cmd1 |
	cmd2 |
	cmd3 >output

is easier to read without backslashes.

> +	cat diff_file.txt \
> +		| sed -E 's/:/ /g' \
> +		| awk '{print $2}' \
> +		| sort \
> +		>new_lines.txt
> +
> +	cat "$hash_file.gcov" \
> +		| grep \#\#\#\#\# \
> +		| sed 's/    #####: //g' \
> +		| sed 's/\:/ /g' \
> +		| awk '{print $1}' \
> +		| sort \
> +		>uncovered_lines.txt

OK, so we assume that we have run coverage in $V2 checkout so that
we can pick up the postimage line numbers in "diff $V1 $V2" and find
corresponding record in .gcov file in the filesystem.  I did not
realize the significance of 'topic' being the later argument to the
script in this part

    After creating the coverage statistics at a version (say,
    'topic') you can then run

        contrib/coverage-diff.sh base topic

of your description before I see this implementation.  Also the
comment at the beginning

    # Usage: 'contrib/coverage-diff.sh <version1> <version2>
    # Outputs a list of new lines in version2 compared to version1 that are
    # not covered by the test suite. Assumes you ran
    # 'make coverage-test coverage-report' from root first, so .gcov files exist.

would want to make it clear that we want coverage run from root
for version2 before using this script.

> +	comm -12 uncovered_lines.txt new_lines.txt \
> +		| sed -e 's/$/\)/' \
> +		| sed -e 's/^/\t/' \
> +		>uncovered_new_lines.txt
> +
> +	grep -q '[^[:space:]]' < uncovered_new_lines.txt && \

Style: when you end a line with && (or || or | for that matter), the
shell knows that you have not finished speaking, and will wait to
listen to you to finish the sentence.  No need for backslash there.

> +		echo $file && \
> +		git blame -c $file \
> +			| grep -f uncovered_new_lines.txt
> +
> +	rm -f diff_file.txt new_lines.txt \
> +		uncovered_lines.txt uncovered_new_lines.txt
> +done
> +
> +rm -rf files.txt