Hello Junio, On Tue, Nov 24, 2009 at 05:12:14PM -0800, Junio C Hamano wrote: > Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> writes: > > > Before this change the author was taken from the raw commit without > > reencoding. > > I see people often begin with "before this change" and stop the log > message after making a statement of a fact. I mildly dislike this style, > especially when the resulting message does not state that it is bad (and > if necessary why it is bad) nor state in what way the code after the > change is good. > > Don't take the author name information without re-encoding > from the raw commit object buffer. > > is easier to read, at least for me. Yes, that's better. Thanks. > > while (*buffer && *buffer != '\n') { > > const char *eol = strchr(buffer, '\n'); > > > > - if (eol == NULL) > > + if (eol == NULL) { > > eol = buffer + strlen(buffer); > > - else > > + } else > > eol++; > > if (!prefixcmp(buffer, "author ")) > > What is this hunk for? This is just a left-over from debugging. Removed. > > @@ -157,20 +162,20 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit) > > die("Missing author: %s", > > sha1_to_hex(commit->object.sha1)); > > if (log->user_format) { > > - struct strbuf buf = STRBUF_INIT; > > struct pretty_print_context ctx = {0}; > > ctx.abbrev = DEFAULT_ABBREV; > > ctx.subject = ""; > > ctx.after_subject = ""; > > ctx.date_mode = DATE_NORMAL; > > + pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx); > > + buffer = ufbuf.buf; > > + > > + } else if (*buffer) > > buffer++; > > + > > You probably wanted to add an extra pair of {} around this "else > if" clause instead, not the earlier one. I removed the new line (the last changed line you quoted) instead. Good? > > diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh > > index 405b971..118204b 100755 > > --- a/t/t4201-shortlog.sh > > +++ b/t/t4201-shortlog.sh > > @@ -51,5 +51,29 @@ git log HEAD > log > > GIT_DIR=non-existing git shortlog -w < log > out > > > > test_expect_success 'shortlog from non-git directory' 'test_cmp expect out' > > +iconvfromutf8toiso885915() { > > + printf "%s" "$@" | iconv -f UTF-8 -t ISO-8859-15 > > +} > > A bad use of "$@" that expands to $# individual words; you meant > to say "$*". OK. > Could we please have the following inside its own test, so that > any failure while preparing the test data is caught as an error? I put it in the test itself. Isn't it ugly to have a test saying something like * ok 3: prepare shortlog encoding test ? Or is it better to see where a failure occurs? > > +git reset --hard "$commit" > > +git config --unset i18n.commitencoding > > +echo 2 > a1 > > +git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>" a1 > > + > > +git config i18n.commitencoding "ISO-8859-15" > > +echo 3 > a1 > > +git commit --quiet -m "$(iconvfromutf8toiso885915 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso885915 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>")" a1 > > +git config --unset i18n.commitencoding > > + > > +git shortlog HEAD~2.. > out > > + > > +cat > expect << EOF > > +Jöhännës "Dschö" Schindëlin (2): > > + set a1 to 2 and some non-ASCII chars: Äßø > > + set a1 to 3 and some non-ASCII chars: áæï > > + > > +EOF > > + > > +test_expect_success 'shortlog encoding' 'test_cmp expect out' > > t3900-i18n-commit already uses 8859-1 so if it is not too much to > ask, it would be much nicer to have these test work between UTF-8 > and 8859-1, not -15. > > That way, I do not have to worry about breaking tests for people > who were able to run existing iconv tests because they do not have > working 8859-15. OK. Below is the updated patch. Best regards Uwe ------------------>8---------------------- From: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> Subject: [PATCH] shortlog: respect commit encoding Don't take the author name information without re-encoding from the raw commit object buffer. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@xxxxxxxxxxxxxx> Cc: Jiri Kosina <jkosina@xxxxxxx> --- builtin-shortlog.c | 20 ++++++++++++-------- t/t4201-shortlog.sh | 23 +++++++++++++++++++++++ 2 files changed, 35 insertions(+), 8 deletions(-) diff --git a/builtin-shortlog.c b/builtin-shortlog.c index 8aa63c7..263adc1 100644 --- a/builtin-shortlog.c +++ b/builtin-shortlog.c @@ -139,8 +139,13 @@ static void read_from_stdin(struct shortlog *log) void shortlog_add_commit(struct shortlog *log, struct commit *commit) { const char *author = NULL, *buffer; + struct strbuf buf = STRBUF_INIT; + struct strbuf ufbuf = STRBUF_INIT; + struct pretty_print_context ctx = {0}; - buffer = commit->buffer; + pretty_print_commit(CMIT_FMT_RAW, commit, &buf, &ctx); + + buffer = buf.buf; while (*buffer && *buffer != '\n') { const char *eol = strchr(buffer, '\n'); @@ -157,20 +162,19 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit) die("Missing author: %s", sha1_to_hex(commit->object.sha1)); if (log->user_format) { - struct strbuf buf = STRBUF_INIT; struct pretty_print_context ctx = {0}; ctx.abbrev = DEFAULT_ABBREV; ctx.subject = ""; ctx.after_subject = ""; ctx.date_mode = DATE_NORMAL; - pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &buf, &ctx); - insert_one_record(log, author, buf.buf); - strbuf_release(&buf); - return; - } - if (*buffer) + pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &ufbuf, &ctx); + buffer = ufbuf.buf; + + } else if (*buffer) buffer++; insert_one_record(log, author, !*buffer ? "<none>" : buffer); + strbuf_release(&ufbuf); + strbuf_release(&buf); } static void get_from_rev(struct rev_info *rev, struct shortlog *log) diff --git a/t/t4201-shortlog.sh b/t/t4201-shortlog.sh index 405b971..03b6950 100755 --- a/t/t4201-shortlog.sh +++ b/t/t4201-shortlog.sh @@ -52,4 +52,27 @@ GIT_DIR=non-existing git shortlog -w < log > out test_expect_success 'shortlog from non-git directory' 'test_cmp expect out' +iconvfromutf8toiso88591() { + printf "%s" "$*" | iconv -f UTF-8 -t ISO-8859-1 +} + +cat > expect << EOF +Jöhännës "Dschö" Schindëlin (2): + set a1 to 2 and some non-ASCII chars: Äßø + set a1 to 3 and some non-ASCII chars: áæï + +EOF + +test_expect_success 'shortlog encoding' ' +git reset --hard "$commit" && +git config --unset i18n.commitencoding && +echo 2 > a1 && +git commit --quiet -m "set a1 to 2 and some non-ASCII chars: Äßø" --author="Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>" a1 && +git config i18n.commitencoding "ISO-8859-1" && +echo 3 > a1 && +git commit --quiet -m "$(iconvfromutf8toiso88591 "set a1 to 3 and some non-ASCII chars: áæï")" --author="$(iconvfromutf8toiso88591 "Jöhännës \"Dschö\" Schindëlin <Johannes.Schindelin@xxxxxx>")" a1 && +git config --unset i18n.commitencoding && +git shortlog HEAD~2.. > out && +test_cmp expect out' + test_done -- 1.6.5.3 -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html