Re: [PATCH 1/2] log: grep author/committer using mailmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>>
>> +static int commit_rewrite_authors(struct strbuf *buf, const char *what, struct string_list *mailmap)
>> +{
>> +     char *author, *endp;
>> +     size_t len;
>> +     struct strbuf name = STRBUF_INIT;
>> +     struct strbuf mail = STRBUF_INIT;
>> +     struct ident_split ident;
>> +
>> +     author = strstr(buf->buf, what);
>> +     if (!author)
>> +             goto error;
>
> This does not stop at the end of the header part and would match a
> random line in the log message that happens to begin with "author ";
> is this something we would worry about, or would we leave it to "fsck"?

The only worrying case would be:
 - commit doesn't have "\nauthor" in the header (can that happen
without corruption?)
 - commit has "\nauthor" in the commit log
 - This line from commit log contains an <email> (split_ident_line works)
Then, I guess it's going to replace the name in the commit log.

Otherwise, it would not replace anything, as there is no author to
replace anyway.

It looks like most mechanisms using mailmap would have the same issue.

>> +     author += strlen(what);
>> +     endp = strstr(author, "\n");
>
> Using strchr(author, '\n') would feel more natural.  Also rename
> "author" to "person" or something, as you would be using this
> function for the committer information as well?

Both fixed

>> +     if (!endp)
>> +             goto error;
>> +
>> +     len = endp - author;
>> +
>> +     if (split_ident_line(&ident, author, len)) {
>> +     error:
>> +             strbuf_release(&name);
>> +             strbuf_release(&mail);
>> +
>> +             return 1;
>
> We usually signal error by returning a negative integer.  It does
> not matter too much in this case as no callers seem to check the
> return value from this function, though.

Fixed, or would you rather see it `void` ?

>> +     }
>> +
>> +     strbuf_add(&name, ident.name_begin, ident.name_end - ident.name_begin);
>> +     strbuf_add(&mail, ident.mail_begin, ident.mail_end - ident.mail_begin);
>> +
>> +     map_user(mailmap, &mail, &name);
>> +
>> +     strbuf_addf(&name, " <%s>", mail.buf);
>> +
>> +     strbuf_splice(buf, ident.name_begin - buf->buf,
>> +                   ident.mail_end - ident.name_begin + 1,
>> +                   name.buf, name.len);
>
> Would it give us better performance if we splice only when
> map_user() tells us that we actually rewrote the ident?

My intuition was that the cost of splice belongs to "memoving", when the
size is different. Yet, Fixed, as it removes two copies.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]