Re: [PATCH v2] gpg-interface: strip CR chars for Windows gpg2

Eric Sunshine <sunshine@xxxxxxxxxxxxx> · Mon, 13 Nov 2017 05:24:14 -0500

On Sun, Nov 12, 2017 at 9:08 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Eric Sunshine <sunshine@xxxxxxxxxxxxxx> writes:
>>> +       size_t i, j;
>>> +       for (i = j = bottom; i < buffer->len; i++)
>>> +               if (!(i < buffer->len - 1 &&
>>> +                               buffer->buf[i] == '\r' &&
>>> +                               buffer->buf[i + 1] == '\n')) {
>>
>> Hmm, was this tested? If I'm reading this correctly, this strips out
>> the entire CRLF pair, whereas the original code only stripped the CR
>> and left what followed it (typically LF) alone. Junio's suggestion was
>> to enhance this to be more careful and strip CR only when followed
>> immediately by LF (but to leave the LF intact). Therefore, this seems
>> like a regression.
>>
>>> +                       if (i != j)
>>> +                               buffer->buf[j] = buffer->buf[i];
>>> +                       j++;
>>> +               }
>
> I think the "negate the entire thing" condition confuses the
> readers.  The negated condition is "Do we still have enough bytes to
> see if we are looking at CRLF?  Are we at CR?  Is the one byte
> beyond what we have a LF?  Do all of these three conditions hold
> true?"  If not, i.e. for all the bytes on the line except for that
> narrow "we are at CR of a CRLF sequence" case, the body of the loop
> makes a literal copy and advances the destination pointer 'j'.  The
> only thing that is skipped is CR that comes at the beginning of a
> CRLF sequence.
>
> So I think the loop does what it wants to do.

You're right. I misunderstood it. Rephrasing the logic in a simpler
fashion would help.

> In any case, I think this should be a two patch series---one with
> the code as-is with a better explanation, but without "make sure
> only CR in CRLF and no other CR are stripped" improvement, and the
> other as a follow-up to it to make the improvement.

Agreed, a 2-patch series makes sense.