Re: [PATCH] shortlog: respect commit encoding

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Junio,

On Tue, Nov 24, 2009 at 05:12:14PM -0800, Junio C Hamano wrote:
> Uwe Kleine-König  <u.kleine-koenig@xxxxxxxxxxxxxx> writes:
> 
> > Before this change the author was taken from the raw commit without
> > reencoding.
> 
> I see people often begin with "before this change" and stop the log
> message after making a statement of a fact.  I mildly dislike this style,
> especially when the resulting message does not state that it is bad (and
> if necessary why it is bad) nor state in what way the code after the
> change is good.
> 
> 	Don't take the author name information without re-encoding
>         from the raw commit object buffer.
> 
> is easier to read, at least for me.
Yes, that's better.  Thanks.
 
> >  	while (*buffer && *buffer != '\n') {
> >  		const char *eol = strchr(buffer, '\n');
> >  
> > -		if (eol == NULL)
> > +		if (eol == NULL) {
> >  			eol = buffer + strlen(buffer);
> > -		else
> > +		} else
> >  			eol++;
> >  		if (!prefixcmp(buffer, "author "))
> 
> What is this hunk for?
This is just a left-over from debugging.  Removed.
 
> > @@ -157,20 +162,20 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
> >  		die("Missing author: %s",
> >  		    sha1_to_hex(commit->object.sha1));
> >  	if (log->user_format) {
> > -		struct strbuf buf = STRBUF_INIT;
> >  		struct pretty_print_context ctx = {0};
> >  		ctx.abbrev = DEFAULT_ABBREV;
> >  		ctx.subject = "";
> >  		ctx.after_subject = "";
> >  		ctx.date_mode = DATE_NORMAL;
> > +		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
> > +		buffer = ufbuf.buf;
> > +
> > +	} else if (*buffer)
> >  		buffer++;
> > +
> 
> You probably wanted to add an extra pair of {} around this "else
> if" clause instead, not the earlier one.
I removed the new line (the last changed line you quoted) instead.
Good?
 
> > diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
> > index 405b971..118204b 100755
> > --- a/t/t4201-shortlog.sh
> > +++ b/t/t4201-shortlog.sh
> > @@ -51,5 +51,29 @@ git log HEAD > log
> >  GIT_DIR=non-existing git shortlog -w < log > out
> >  
> >  test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
> > +iconvfromutf8toiso885915() {
> > +	printf "%s" "$@" | iconv -f UTF-8 -t ISO-8859-15
> > +}
> 
> A bad use of "$@" that expands to $# individual words; you meant
> to say "$*".
OK.
 
> Could we please have the following inside its own test, so that
> any failure while preparing the test data is caught as an error?
I put it in the test itself.  Isn't it ugly to have a test saying
something like
	
*   ok 3: prepare shortlog encoding test

?  Or is it better to see where a failure occurs?

> > +git reset --hard "$commit"
> > +git config --unset i18n.commitencoding
> > +echo 2 > a1
> > +git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>" a1
> > +
> > +git config i18n.commitencoding "ISO-8859-15"
> > +echo 3 > a1
> > +git commit --quiet -m "$(iconvfromutf8toiso885915 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso885915 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>")" a1
> > +git config --unset i18n.commitencoding
> > +
> > +git shortlog HEAD~2.. > out
> > +
> > +cat > expect << EOF
> > +Jöhännës "Dschö" Schindëlin (2):
> > +      set a1 to 2 and some non-ASCII chars: Äßø
> > +      set a1 to 3 and some non-ASCII chars: áæï
> > +
> > +EOF
> > +
> > +test_expect_success 'shortlog encoding' 'test_cmp expect out'
> 
> t3900-i18n-commit already uses 8859-1 so if it is not too much to
> ask, it would be much nicer to have these test work between UTF-8
> and 8859-1, not -15.
> 
> That way, I do not have to worry about breaking tests for people
> who were able to run existing iconv tests because they do not have
> working 8859-15.
OK.

Below is the updated patch.

Best regards
Uwe

------------------>8----------------------
From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
Subject: [PATCH] shortlog: respect commit encoding

Don't take the author name information without re-encoding from the raw
commit object buffer.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx>
Cc: Jiri Kosina <jkosina@xxxxxxx>
---
 builtin-shortlog.c  |   20 ++++++++++++--------
 t/t4201-shortlog.sh |   23 +++++++++++++++++++++++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/builtin-shortlog.c b/builtin-shortlog.c
index 8aa63c7..263adc1 100644
--- a/builtin-shortlog.c
+++ b/builtin-shortlog.c
@@ -139,8 +139,13 @@ static void read_from_stdin(struct shortlog *log)
 void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 {
 	const char *author = NULL, *buffer;
+	struct strbuf buf = STRBUF_INIT;
+	struct strbuf ufbuf = STRBUF_INIT;
+	struct pretty_print_context ctx = {0};
 
-	buffer = commit->buffer;
+	pretty_print_commit(CMIT_FMT_RAW, commit, &buf, &ctx);
+
+	buffer = buf.buf;
 	while (*buffer && *buffer != '\n') {
 		const char *eol = strchr(buffer, '\n');
 
@@ -157,20 +162,19 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 		die("Missing author: %s",
 		    sha1_to_hex(commit->object.sha1));
 	if (log->user_format) {
-		struct strbuf buf = STRBUF_INIT;
 		struct pretty_print_context ctx = {0};
 		ctx.abbrev = DEFAULT_ABBREV;
 		ctx.subject = "";
 		ctx.after_subject = "";
 		ctx.date_mode = DATE_NORMAL;
-		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &buf, &ctx);
-		insert_one_record(log, author, buf.buf);
-		strbuf_release(&buf);
-		return;
-	}
-	if (*buffer)
+		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx);
+		buffer = ufbuf.buf;
+
+	} else if (*buffer)
 		buffer++;
 	insert_one_record(log, author, !*buffer ? "<none>" : buffer);
+	strbuf_release(&ufbuf);
+	strbuf_release(&buf);
 }
 
 static void get_from_rev(struct rev_info *rev, struct shortlog *log)
diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh
index 405b971..03b6950 100755
--- a/t/t4201-shortlog.sh
+++ b/t/t4201-shortlog.sh
@@ -52,4 +52,27 @@ GIT_DIR=non-existing git shortlog -w < log > out
 
 test_expect_success 'shortlog from non-git directory' 'test_cmp expect out'
 
+iconvfromutf8toiso88591() {
+	printf "%s" "$*" | iconv -f UTF-8 -t ISO-8859-1
+}
+
+cat > expect << EOF
+Jöhännës "Dschö" Schindëlin (2):
+      set a1 to 2 and some non-ASCII chars: Äßø
+      set a1 to 3 and some non-ASCII chars: áæï
+
+EOF
+
+test_expect_success 'shortlog encoding' '
+git reset --hard "$commit" &&
+git config --unset i18n.commitencoding &&
+echo 2 > a1 &&
+git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>" a1 &&
+git config i18n.commitencoding "ISO-8859-1" &&
+echo 3 > a1 &&
+git commit --quiet -m "$(iconvfromutf8toiso88591 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso88591 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>")" a1 &&
+git config --unset i18n.commitencoding &&
+git shortlog HEAD~2.. > out &&
+test_cmp expect out'
+
 test_done
-- 
1.6.5.3

-- 
Pengutronix e.K.                              | Uwe Kleine-König            |
Industrial Linux Solutions                    | http://www.pengutronix.de/  |
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]