On Thu, Sep 07, 2017 at 11:56:48PM -0400, Jeff King wrote:

> > True; I do not think string_list API does. But for this particular
> > application, I suspect that we can by looking at the util field of
> > the item returned. A newly created one has NULL, but we always make
> > it non-NULL before leaving this function.
>
> Yeah, I agree that would work here.
>
> I also wondered if we could get away with avoiding the malloc entirely
> here. Especially in the "shortlog -n" case, it is identical to the name
> field we already have in ident.name. So ideally we'd do a lookup to see
> if we have the entry before allocating anything (since we do one lookup
> per commit, but only insert once per unique author).
>
> But that doesn't quite work, because ident.name doesn't point to a
> NUL-terminated string, and string_list only handles strings.

I happened to look at this more while digging on an unrelated shortlog
bug. I think the whole thing could actually be reorganized a bit.

We call insert_one_record() from shortlog_add_commit(). The latter
formats "%an <%ae>", only to have the former parse it back to its
constituent parts. That seems rather silly.

This is an artifact of shortlog's original mode, which was to parse "git
log" output. But for an internal traversal, we can just format the
correct item right off the bat.

That part of insert_one_record() is also where we handle the mailmap
mapping. But again, the internal traversal can just use "%aE" to format
that correctly in the first place.

IOW, something like the patch below, which pushes the re-parsing out to
the stdin code-path, and lets the internal traversal format directly
into the final buffer. It seems to be about 3% faster than the existing
code, and fixes the leak (by dropping that variable entirely).

-Peff

---
diff --git a/builtin/shortlog.c b/builtin/shortlog.c
index 43c4799ea9..e29875b843 100644
--- a/builtin/shortlog.c
+++ b/builtin/shortlog.c
@@ -52,26 +52,8 @@ static void insert_one_record(struct shortlog *log,
 			      const char *oneline)
 {
 	struct string_list_item *item;
-	const char *mailbuf, *namebuf;
-	size_t namelen, maillen;
-	struct strbuf namemailbuf = STRBUF_INIT;
-	struct ident_split ident;
 
-	if (split_ident_line(&ident, author, strlen(author)))
-		return;
-
-	namebuf = ident.name_begin;
-	mailbuf = ident.mail_begin;
-	namelen = ident.name_end - ident.name_begin;
-	maillen = ident.mail_end - ident.mail_begin;
-
-	map_user(&log->mailmap, &mailbuf, &maillen, &namebuf, &namelen);
-	strbuf_add(&namemailbuf, namebuf, namelen);
-
-	if (log->email)
-		strbuf_addf(&namemailbuf, " <%.*s>", (int)maillen, mailbuf);
-
-	item = string_list_insert(&log->list, namemailbuf.buf);
+	item = string_list_insert(&log->list, author);
 
 	if (log->summary)
 		item->util = (void *)(UTIL_TO_INT(item) + 1);
@@ -114,9 +96,33 @@ static void insert_one_record(struct shortlog *log,
 	}
 }
 
+static int parse_stdin_author(struct shortlog *log,
+			      struct strbuf *out, const char *in)
+{
+	const char *mailbuf, *namebuf;
+	size_t namelen, maillen;
+	struct ident_split ident;
+
+	if (split_ident_line(&ident, in, strlen(in)))
+		return -1;
+
+	namebuf = ident.name_begin;
+	mailbuf = ident.mail_begin;
+	namelen = ident.name_end - ident.name_begin;
+	maillen = ident.mail_end - ident.mail_begin;
+
+	map_user(&log->mailmap, &mailbuf, &maillen, &namebuf, &namelen);
+	strbuf_add(out, namebuf, namelen);
+	if (log->email)
+		strbuf_addf(out, " <%.*s>", (int)maillen, mailbuf);
+
+	return 0;
+}
+
 static void read_from_stdin(struct shortlog *log)
 {
 	struct strbuf author = STRBUF_INIT;
+	struct strbuf mapped_author = STRBUF_INIT;
 	struct strbuf oneline = STRBUF_INIT;
 	static const char *author_match[2] = { "Author: ", "author " };
 	static const char *committer_match[2] = { "Commit: ", "committer " };
@@ -134,9 +140,15 @@ static void read_from_stdin(struct shortlog *log)
 		while (strbuf_getline_lf(&oneline, stdin) != EOF &&
 		       !oneline.len)
 			; /* discard blanks */
-		insert_one_record(log, v, oneline.buf);
+
+		strbuf_reset(&mapped_author);
+		if (parse_stdin_author(log, &mapped_author, v) < 0)
+			continue;
+
+		insert_one_record(log, mapped_author.buf, oneline.buf);
 	}
 	strbuf_release(&author);
+	strbuf_release(&mapped_author);
 	strbuf_release(&oneline);
 }
 
@@ -153,7 +165,9 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 	ctx.date_mode.type = DATE_NORMAL;
 	ctx.output_encoding = get_log_output_encoding();
 
-	fmt = log->committer ? "%cn <%ce>" : "%an <%ae>";
+	fmt = log->committer ?
+		(log->email ? "%cN <%cE>" : "%cN") :
+		(log->email ? "%aN <%aE>" : "%aN");
 
 	format_commit_message(commit, fmt, &author, &ctx);
 	if (!log->summary) {
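
For anyone not used to the pointer-and-length style that
parse_stdin_author() relies on, here is a rough standalone sketch of the
same idea. It is not git code: struct ident_parts, split_ident() and
format_author() are hypothetical stand-ins for struct ident_split,
split_ident_line() and the strbuf calls, the parser is much simpler than
the real one, and the mailmap step is left out entirely. It only shows
why "%.*s" is needed when the name and email are slices of the input
line rather than NUL-terminated strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for git's struct ident_split: the fields point
 * into the input line, so none of them is NUL-terminated on its own.
 */
struct ident_parts {
	const char *name_begin, *name_end;
	const char *mail_begin, *mail_end;
};

/*
 * Simplified parser for "Name <email>" (an assumption for illustration;
 * the real split_ident_line() handles far more cases).
 * Returns 0 on success, -1 if no "<...>" pair is found.
 */
static int split_ident(struct ident_parts *p, const char *line)
{
	const char *lt = strchr(line, '<');
	const char *gt = lt ? strchr(lt, '>') : NULL;

	if (!lt || !gt)
		return -1;

	p->name_begin = line;
	p->name_end = lt;
	/* drop the spaces between the name and '<' */
	while (p->name_end > p->name_begin && p->name_end[-1] == ' ')
		p->name_end--;
	p->mail_begin = lt + 1;
	p->mail_end = gt;
	return 0;
}

/*
 * Write either "Name" or "Name <email>" into out. The with_email flag
 * plays the role that log->email plays in the patch.
 */
static int format_author(char *out, size_t outlen,
			 const char *line, int with_email)
{
	struct ident_parts p;
	int namelen, maillen;

	assert(out && outlen > 0);

	if (split_ident(&p, line) < 0)
		return -1;

	namelen = (int)(p.name_end - p.name_begin);
	maillen = (int)(p.mail_end - p.mail_begin);

	/* "%.*s" prints exactly that many bytes, no terminator required */
	if (with_email)
		snprintf(out, outlen, "%.*s <%.*s>",
			 namelen, p.name_begin, maillen, p.mail_begin);
	else
		snprintf(out, outlen, "%.*s", namelen, p.name_begin);
	return 0;
}
```

Calling format_author() with with_email set to 0 keeps only the name
part, and a line without a "<...>" pair is rejected with -1, which is
the case where read_from_stdin() in the patch does "continue".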