Alexey Shumkin <Alex.Crezoff@xxxxxxxxx> writes: Subject: Re: [PATCH v5 5/5] pretty: user format ignores i18n.logOutputEncoding setting That is a statement of fact, and does not tell much to the reader. I think you are saying that in the current implementation, logoutputencoding is not honored in user format, that it is a bug that needs to be fixed (as opposed to "that is by design, the scripts that read from --format='' is responsible for doing the conversion"), and that this patch fixes it. So pretty: --format output should honor logOutputEncoding or something? At least it says what is the _desired_ outcome with "should", hints that the status-quo is different from that desired outcome and implies that this is a patch to fix it. The "Subject" line is very important as that is the only thing we see in many summarizing output format, e.g. shortlog, cover letter and merge message. > The following two commands are expected to give the same output to a terminal: > > $ git log --oneline --no-color > $ git log --pretty=format:'%h %s' > > However, the former pays attention to i18n.logOutputEncoding > configuration, while the latter does not when it format "%s". > Log messages written in an encoding i18n.commitEncoding which differs > from terminal encoding are shown corrupted with the latter even > when i18n.logOutputEncoding and terminal encoding are the same. > > The same corruption is true for > $ git diff --submodule=log > and > $ git rev-list --pretty=format:%s HEAD > and > $ git reset --hard > > Signed-off-by: Alexey Shumkin <Alex.Crezoff@xxxxxxxxx> > --- > builtin/reset.c | 8 ++++++-- > builtin/rev-list.c | 1 + > builtin/shortlog.c | 1 + > log-tree.c | 1 + > submodule.c | 3 +++ > t/t4041-diff-submodule-option.sh | 10 +++++----- > t/t4205-log-pretty-formats.sh | 34 +++++++++++++++++----------------- > t/t6006-rev-list-format.sh | 8 ++++---- > t/t7102-reset.sh | 2 +- > 9 files changed, 39 insertions(+), 29 deletions(-) > > diff --git a/builtin/reset.c b/builtin/reset.c > index 6032131..b23ed63 100644 > --- a/builtin/reset.c > +++ b/builtin/reset.c > @@ -92,11 +92,15 @@ static int reset_index(const unsigned char *sha1, int reset_type, int quiet) > > static void print_new_head_line(struct commit *commit) > { > - const char *hex, *body; > + const char *hex, *body, *encoding; > > hex = find_unique_abbrev(commit->object.sha1, DEFAULT_ABBREV); > printf(_("HEAD is now at %s"), hex); > - body = strstr(commit->buffer, "\n\n"); > + encoding = get_log_output_encoding(); > + body = logmsg_reencode(commit, NULL, encoding); > + if (!body) > + body = commit->buffer; Does this happen? I thought body, without an error, can be the same as commit->buffer. > + body = strstr(body, "\n\n"); > if (body) { > const char *eol; > size_t len; Do we have a leak here? body may point at a new piece of memory logmsg_reencode() have allocated to hold the UTF-8 version of your 8859-5 message in commit->buffer. It would be more like this, no? diff --git a/builtin/reset.c b/builtin/reset.c index 6032131..8d22ffe 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -92,11 +92,13 @@ static int reset_index(const unsigned char *sha1, int reset_type, int quiet) static void print_new_head_line(struct commit *commit) { - const char *hex, *body; + const char *hex, *body, *msg, *encoding; hex = find_unique_abbrev(commit->object.sha1, DEFAULT_ABBREV); printf(_("HEAD is now at %s"), hex); - body = strstr(commit->buffer, "\n\n"); + + msg = logmsg_reencode(commit, NULL, get_log_output_encoding()); + body = strstr(msg, "\n\n"); if (body) { const char *eol; size_t len; @@ -107,6 +109,7 @@ static void print_new_head_line(struct commit *commit) } else printf("\n"); + logmsg_free(msg, commit); } static void update_index_from_diff(struct diff_queue_struct *q, -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html