Re: [PATCH] shortlog: respect commit encoding

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Uwe Kleine-König  <u.kleine-koenig@xxxxxxxxxxxxxx> writes:

> Before this change the author was taken from the raw commit without
> reencoding.

I see people often begin with "before this change" and stop the log
message after making a statement of a fact.  I mildly dislike this style,
especially when the resulting message does not state that it is bad (and
if necessary why it is bad) nor state in what way the code after the
change is good.

	Don't take the author name information without re-encoding
        from the raw commit object buffer.

is easier to read, at least for me.

>  	while (*buffer && *buffer != '\n') {
>  		const char *eol = strchr(buffer, '\n');
>  
> -		if (eol == NULL)
> +		if (eol == NULL) {
>  			eol = buffer + strlen(buffer);
> -		else
> +		} else
>  			eol++;
>  		if (!prefixcmp(buffer, "author "))

What is this hunk for?

> @@ -157,20 +162,20 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
>  		die("Missing author: %s",
>  		    sha1_to_hex(commit->object.sha1));
>  	if (log->user_format) {
> -		struct strbuf buf = STRBUF_INIT;
>  		struct pretty_print_context ctx = {0};
>  		ctx.abbrev = DEFAULT_ABBREV;
>  		ctx.subject = "";
>  		ctx.after_subject = "";
>  		ctx.date_mode = DATE_NORMAL;
> +		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
> +		buffer = ufbuf.buf;
> +
> +	} else if (*buffer)
>  		buffer++;
> +

You probably wanted to add an extra pair of {} around this "else
if" clause instead, not the earlier one.

Otherwise the change looks good from my cursory look.

> diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
> index 405b971..118204b 100755
> --- a/t/t4201-shortlog.sh
> +++ b/t/t4201-shortlog.sh
> @@ -51,5 +51,29 @@ git log HEAD > log
>  GIT_DIR=non-existing git shortlog -w < log > out
>  
>  test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
> +iconvfromutf8toiso885915() {
> +	printf "%s" "$@" | iconv -f UTF-8 -t ISO-8859-15
> +}

A bad use of "$@" that expands to $# individual words; you meant
to say "$*".

Could we please have the following inside its own test, so that
any failure while preparing the test data is caught as an error?

> +git reset --hard "$commit"
> +git config --unset i18n.commitencoding
> +echo 2 > a1
> +git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>" a1
> +
> +git config i18n.commitencoding "ISO-8859-15"
> +echo 3 > a1
> +git commit --quiet -m "$(iconvfromutf8toiso885915 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso885915 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>")" a1
> +git config --unset i18n.commitencoding
> +
> +git shortlog HEAD~2.. > out
> +
> +cat > expect << EOF
> +Jöhännës "Dschö" Schindëlin (2):
> +      set a1 to 2 and some non-ASCII chars: Äßø
> +      set a1 to 3 and some non-ASCII chars: áæï
> +
> +EOF
> +
> +test_expect_success 'shortlog encoding' 'test_cmp expect out'

t3900-i18n-commit already uses 8859-1 so if it is not too much to
ask, it would be much nicer to have these test work between UTF-8
and 8859-1, not -15.

That way, I do not have to worry about breaking tests for people
who were able to run existing iconv tests because they do not have
working 8859-15.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]