Re: [PATCH 3/7] pack-protocol.txt: Mark all LFs in push-cert as required

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 06 Jul 2015 09:12:39 -0700

Dave Borowitz <dborowitz@xxxxxxxxxx> writes:

> Unfortunately, optional LFs still make the stored certs for later
> auditing and parsing a bit illegible. This is one way in which push
> certs are fundamentally different from the rest of the wire protocol,
> which is not intended to be persisted.

Hmm, I am not sure I follow.  

> The corner case I pointed out before where nonce runs into commands is
> not the only one.
>
> Consider the following cert fragment:
> 001fpushee git://localhost/repo
> 0029nonce 1433954361-bde756572d665bba81d8
>
> A naive cert storage/auditing implementation would store the raw
> payload that needs to be verified, without the pkt-line framing. In
> this case:
> pushee git://localhost/repononce 1433954361-bde756572d665bba81d8

"Without the pkt-line framing" is fine, but my understanding (or,
the intention of the original implementor) of this part of the
protocol is that "packets between the push-cert packet and the
push-cert-end packet carry the meat of the each line of the
certificate, one packet per line".

If pkt-line is allowed to omit the terminating LFs, then it follows
that the receiving ends can simply do something like what I
illustrated in $gmane/273196 (in java or whatever other
implementation platform they use) to collect packets between
"push-cert" and "push-cert-end", knowing that the packets may or may
not have terminating LF and supplying the omitted LFs themselves
when they receive the cert before verifying and storing.

So in order to reconstitute the "raw payload without pkt-line framing",
the omitted LF obviously needs to be added.  Why is that a problem?

    Side note: think of it in a different way.  The key word of the
    first paragraph above is "the meat of"; if your cert has two
    lines

    	"pushee $URL<LF>nonce 1234-5670<LF>"

    the lines in it are "pushee $URL<LF>" and "nonce 1234-5670<LF>"
    but the meat of them are "pushee $URL" and "nonce 1234-5670".

    The protocol wants to carry an array with two elements, ("pushee
    $URL", "nonce 1234-5670"), as the hypothetical cert has two
    lines.  And then "\n".join(the cert array) . "\n" would be how
    you reconstruct the original payload.

    The illustration in $gmane/273196 is slightly cheating in that
    sense.  Instead of first creating an array of plain strings
    without LF termination and joining them together later, it knows
    that we will LF-join in the end, and abuses the LF in the
    original payload that came from the sender and supplies its own
    if the sender omitted it.

It is very similar to and in the opposite of how each ref advertisement
is handled.  Until the first flush, each packet is expected to carry
the object name and the ref name.  A pkt-line framing may add
terminating LF but that obviously is not part of the ref name.

> A naive parser that wants to find the pushee would look for "pushee
> <urlish>", which would be wrong in this case. (To say nothing of the
> fact that "pushee" might actually be "-0700pushee".)
>
> The alternatives for someone writing a parser are:
> a. Store the original pkt-line framing.
> b. Write a parser in some other clever way, e.g. parsing the entire
> cert in reverse might work.
>
> Neither of these is very satisfying, and both reduce human legibility
> of the stored payload.
>
>>> > If LF is optional, then with that approach you might end up with a
>>> > section of that buffer like:
>>>
>>> I think I touched on this in my previous message.  You cannot send
>>> an empty line anywhere, and this is not limited to push-cert section
>>> of the protocol.  Strictly speaking, the wire level allows it, but I
>>> do not think the deployed client APIs easily lets you deal with it.
>>>
>>> So you must follow the "SHOULD terminate with LF" for an empty line,
>>> even when you choose to ignore the "SHOULD" in most other places.
>>>
>>> I do not think it is such a big loss, as long as it is properly
>>> documented.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html