Re: git error in tag ...: unterminated header

Junio C Hamano <gitster@xxxxxxxxx> · Sat, 27 Jun 2015 11:36:48 -0700

Johannes Schindelin <johannes.schindelin@xxxxxx> writes:

> On 2015-06-26 19:37, Junio C Hamano wrote:
>> Jeff King <peff@xxxxxxxx> writes:
>> 
>>> On Fri, Jun 26, 2015 at 10:06:20AM +0200, Johannes Schindelin wrote:
>>>
>>>> I understood what you were saying, but it still appears too fragile to
>>>> me to mix functions that assume NUL-terminated strings with an ad-hoc
>>>> counted string check.
>>>
>>> Yeah, I agree. It is not that you cannot make it safe, but that it is
>>> simply a fragile maintenance burden in the future. I thought we dealt
>>> with this already with a1e920a (index-pack: terminate object buffers
>>> with NUL, 2014-12-08), though.
>> 
>> Hmph, that is an interesting point.
>> 
>> It would mean that the require_eoh() can be reduced a bit further.
>> ...
>> That would mean the name of the helper needs to change, though.
>
> You mean in addition to your changes to read new lines only when we're
> still inside the buffer?

I think what Peff meant was that we always have the NUL at the end
of the buffer in a world with with a1e920a0 (index-pack: terminate
object buffers with NUL, 2014-12-08).

That means that we can replace require-eoh with verify-headers in
the message you are responding to and update the callers to call the
new function without doing anything else.

It might be tempting to say that require-eoh is not necessary, but I
think verify-headers still has its values. The running fsck may not
know some of the new headers the version of Git that produced the
object being verified knows; hence it is given that the line-by-line
verification does not check all the header lines individually.  But
at least we know that we know the header part of the object must be
terminated with LF and does not contain a NUL even for any new
header that will be invented in the future.  I.e. an object without
a body and ends with an incomplete line as the last line of the
header will not be allowed, ever.  And the only sane way to verify
that is by scanning the object upfront before we verify the known
ones line-by-line, just like we did with require-eoh.

As long as the code verifies line-by-line (which by the way the
"demotable error level" also depends on to allow it to re-sync to
the next header line after seeing an error in one header line; I do
not expect the line-by-line nature of verification to change in the
future for this reason), "make sure that the header part ends with
LF and before starting to read each header line to verify, make sure
we still have data to read" is not as fragile as you made it sound
in your past few messages, simply because there is no valid reason
to use start_with() with a string that has LF in the actual
verification code.  That would be only necessary if we have a known
header line that consists of a fixed token without any variable data
on it before the terminating LF, but in the context of talking about
Git object header, having such a header line is absurd in the first
place.

But with NUL termination guaranteed, we no longer need "before
starting to read each header, the size says we still have something
to read".  That is why I think updating require-eoh to verify-headers
is the only thing we need to do.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html