Re: [bug] blame duplicates trailing ">" in mailmapped emails

Felipe Contreras <felipe.contreras@xxxxxxxxx> · Mon, 6 Feb 2012 14:14:56 +0200

On Mon, Feb 6, 2012 at 5:16 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Jeff King <peff@xxxxxxxx> writes:
>
>> Ugh, yeah. I was thinking about how it would improve this call site, but
>> I don't want to get into auditing the others. Let's drop it and go with
>> your patch.
>
> In any case, here is what I queued for tonight.
>
> -- >8 --
> Subject: [PATCH] mailmap: do not leave '>' in the output when answering "we did something"

This subject doesn't explain the *purpose* of the patch: always return
a plain mail address from map_user()

That would be enough for most readers.

I think the immediate problem should be here:

Currently 'git blame -e' would add an extra '>' if map_user() returns
true, which would end up as '<foo@xxxxxxx>>'. This is because
map_user() sometimes modifies, the mail string, but sometimes not. So
let's always modify it.

At this point a lot of readers can skip the rest of the explanation.

> The callers of map_user() give email and name to it, and expect to get an
> up-to-date versions of email and/or name to be used in their output. The
> function rewrites the given buffers in place. To optimize the majority of
> cases, the function returns 0 when it did not do anything, and it returns
> 1 when the caller should use the updated contents.
>
> The 'email' input to the function is terminated by '>' or a NUL (whichever
> comes first) for historical reasons, but when a rewrite happens, the value
> is replaced with the mailbox inside the <> pair.  However, it failed to
> meet this expectation when it only rewrote the name part without rewriting
> the email part, and the email in the input was terminated by '>'.

With the above explanation I suggested, I think this can be summarized as:

As of right now most of the callers of map_user() give a plan address,
and expect a plain address, however, 'git blame -e' passes along a
mail address that ends with >, and uses the return value to determine
if a transformation was made, and adds the missing '>'. This is
because when map_user() does the transformation, it returns a plan
address.

This might have worked in previous versions of map_user(), but now not
only does it transforms mail addresses, but also the name (phrase). So
now callers can't use the return value to know if they need to add an
extra '>', or not. So lets always remove the '>', so they can.

> This causes an extra '>' to appear in the output of "blame -e", because the
> caller does send in '>'-terminated email, and when the function returned 1
> to tell it that rewriting happened, it appends '>' that is necessary when
> the email part was rewritten.

It took too long to reach the problem. As a reader, that's the first
thing I would expect.

> The patch looks bigger than it actually is, because this change makes a
> variable that points at the end of the email part in the input 'p' live
> much longer than it used to, deserving a more descriptive name.

This is a *separate* logically independent patch. And you yourself
mention, makes the review of the patch more difficult, as of right now
it's difficult to spot the functional change, because it's mixed with
non-functional changes.

I find it peculiar how patches in the Linux kernel are truly logically
independent, but Git has a tendency to squash many things: cleanups,
fixes, documentation, tests, extra tests, remove unused code, etc.

Cheers.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html