Re: [WIP-PATCH 0/2] send-email: refactor the email parser loop

Eric Wong <e@xxxxxxxxx> · Fri, 27 May 2016 20:14:36 +0000

Samuel GROOT <samuel.groot@xxxxxxxxxxxxxxxx> wrote:
> While working on the new option `quote-email`, we needed to parse an
> email file. Since the work is already done, but the parsing and data
> processing are in the same loop, we wanted to refactor the parser, to
> make the code more maintainable.

Thank you for doing this work :)

> This is still WIP, and one of the main issue (and we need your
> advice on that), is that around 30 tests will fail, and most of them
> are false-negatives: to pass, they rely on the format of what is
> displayed by `git send-email`, not only data.
> 
> 
> For example, several tests will fail because they do a strict compare
> between `git send-email`'s output and:
> 
>    (mbox) Adding cc: A<author@xxxxxxxxxxx>  from line 'Cc: A<author@xxxxxxxxxxx>, One<one@xxxxxxxxxxx>'
>    (mbox) Adding cc: One<one@xxxxxxxxxxx>  from line 'Cc: A<author@xxxxxxxxxxx>, One<one@xxxxxxxxxxx>'
> 
> Though `git send-email` now outputs something like:
> 
>    (mbox) Adding cc: A<author@xxxxxxxxxxx>  from line 'Cc: A<author@xxxxxxxxxxx>'
>    (mbox) Adding cc: One<one@xxxxxxxxxxx>  from line 'Cc: One<one@xxxxxxxxxxx>'
I actually like neither, and would prefer something shorter:

    (mbox) Adding cc: A <author@xxxxxxxxxxx> from Cc: header
    (mbox) Adding cc: One <one@xxxxxxxxxxx> from Cc: header
    (mbox) Adding cc: SoB <SoB@xxxxxxxxxxx> from Signed-off-by: trailer

That way, there's no redundant addresses in each line and less
likely to wrap.

But I actually never noticed these lines in the past since they
scrolled off my terminal :x

Since the headers are already shown after those lines, it's
redundant to have the entire line.  And we could add
trailers after the headers (with a blank line to delimit):

    # existing header output:
    From: F <F@xxxxxxxxxxx>
    Cc: A <author@xxxxxxxxxxx>, One <one@xxxxxxxxxxx>
    Subject: foo

    # new trailer output
    Signed-off-by: SoB <SoB@xxxxxxxxxxx>
    Acked-by: ack <ack@xxxxxxxxxxx>

> We can think of several solutions:
> 
>    1. Modify the parser, to give the script the knowledge of the exact
>       line the data came from.
> 
>    2. Hack the tests: modify the script using the parser to artificially
>       recreate the supposedly parsed line.
>       (e.g. with a list.join(', ')-like function)
> 
>    3. Modify the tests to replace exact cmp by greps.
> 
> 
> IMHO, we should consider option 3, since the tests should rely on data
> rather than exact outputs. It also makes the tests more maintainable,
> in the sense that they will be more resilient to future output
> modifications.

Agreed on 3.

I am not sure if anybody outside of tests parses the stdout of
send-email.  It's certainly a porcelain and I don't think
stdout needs to be stable, and maybe the output in
question should go to stderr since it could be considered
debug output.

But I could be wrong...
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html