Re: [PATCH v3 2/3] sha1_file: implement changes for "cat-file --literally -t"

Junio C Hamano <gitster@xxxxxxxxx> · Sun, 08 Mar 2015 12:09:48 -0700

karthik nayak <karthik.188@xxxxxxxxx> writes:

> Sorry for the confusion, you did already say that in $gmane/264955 , I'm
> talking about how I tackled the issue in $gmane/264855.

Well, I am suggesting how to improve what you did in your
$gmane/264855 and a part of that was to suggest that teaching
parse_sha1_header() the "literally" mode may be necessary and why
"do not have to call parse_sha1_header()" may not be a good
solution.

Unless our goal is only to support "--literally -t" and go home,
never intending to support other things like "--literally -s" and
"--literally flob $OBJ", that is ;-)

Let's try again.

> Like :
>
>     else if ((flags & LOOKUP_LITERALLY)) {
>         size_t typelen = strcspn(hdrbuf.buf, " ");
>         strbuf_add(oi->typename, hdrbuf.buf, typelen);
>     }
>     else if ((status = parse_sha1_header(hdrp, &size)) < 0)
>         status = error("unable to parse %s header", sha1_to_hex(sha1));
>     else if (oi->sizep)
>         *oi->sizep = size;
>
> This way, we don't have to modify parse_sha1_header() to worry if "literally"
> is set or not.

When you are dealing with a crafted object $OBJ of type "florb", how
would your

    $ git cat-file --literally florb $OBJ
    $ git cat-file --literally -s $OBJ

be implemented, if parse_sha1_header() has not been enhanced to
allow you to say "for this invocation, the type name in the object
header may be something unknown to us, and it is OK"?

One possible implementation of "--literally florb $OBJ" would be to
still call the same loose object reading codepath, which would
eventually hit parse_sha1_header() to see what the type of the
object is and how long the object contents is, and error out if the
type is not "florb" or the length of the inflated contents does not
match the expected size.  Wouldn't you need a way for you to say
"for this invocation, the type name in the object header may be
something unknown to us, and it is OK"?

One possible implementation of "--literally -s $OBJ" would be to
change the call to sha1_object_info() to instead to call the
_extended() version, just like you did for "--literally -t", and
then read the result from *(oi.sizep), but the quoted piece of code
above would not let you use it that way, no?

Of course the above two are both "one possible implementation", and
not how the implementation ought to be [*1*].  But knowing a bit of
this part of the codepath, they look the most natural way to enhance
these codepaths, at least to me.

[Footnote]

*1* You could for example sidestep the issue and introduce a
parallel implementation of what parse_sha1_header() does, minus the
validation to ensure the object type is one of the known ones, and
use that function instead of the users of parse_sha1_header() in the
codepaths that implement "-s" and "cat-file" itself.

But I do not think that is a good direction to go in to keep the
codebase healthy in the longer term.  A refactoring that is similar
to what we did when we introduced sha1_object_info_extended(), which
was done in 9a490590 (sha1_object_info_extended(): expose a bit more
info, 2011-05-12) would be more desirable, so that we can avoid such
a duplicated parallel implementations.  That way, the existing
callers that do not have to know about "--literally" can keep using
parse_sha1_header(), but parse_sha1_header_extended() is called from
the updated parse_sha1_header() behind the scene.  And the extended
version would let callers that need to care about "--literally" ask
"the type name in the object header may be something unknown to us,
and it is OK" by passing an extra flag.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html