Re: [PATCH v2 01/10] Define a structure for object IDs.

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 12 Mar 2015 11:24:32 -0700

Duy Nguyen <pclouds@xxxxxxxxx> writes:

> This may or may not fall into the "mix different hash functions"
> category. In pack files version 4, trees are encoded to point to other
> trees or blobs by a (pack, offset) tuple. It would be great if the new
> object_id could support carrying this kind of object id around because
> it could help reduce object lookup cost a lot. (pack, offset) can be
> converted back to SHA-1 so no info is lost and hashcmp() can compare
> (pack, offset) against an SHA-1 just fine.

You mean "if it came in <pack, offset> format, do not convert it
down to <sha1> until the last moment it is needed (e.g. when we need
to put it in a tree object in order to compute the object name of
the containing tree object)"?

If we are doing the "union in struct" thing, then once an object
name originally represented as <pack, offset> has been converted to
its <sha1> representation, you would have to look it up in the .idx
in order to read the contents the usual way.  If that happens often
enough, it may not be worth adding complexity to the code to carry
the <pack, offset> pair around.

Unless you fix that "union in struct" assumption, that is.
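
To illustrate the problem with the "union in struct" layout, here is
a rough sketch.  None of these names exist in git; "oid_sketch",
"pack_stub" and the helper functions are made up for this example,
and the real object_id structure may end up looking quite different.
The point is only that the two representations share storage, so
converting to <sha1> throws the <pack, offset> pair away.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RAWSZ 20

/* Hypothetical stand-in for a pack handle; not git's struct packed_git. */
struct pack_stub { int id; };

enum oid_kind { OID_SHA1, OID_PACK_OFFSET };

/* Hypothetical "union in struct" object name: one form at a time. */
struct oid_sketch {
	enum oid_kind kind;
	union {
		unsigned char sha1[SKETCH_RAWSZ];
		struct {
			const struct pack_stub *pack;
			uint64_t offset;
		} loc;
	} u;
};

/* Store the name as a <pack, offset> pair. */
static void oid_set_location(struct oid_sketch *oid,
			     const struct pack_stub *pack, uint64_t offset)
{
	assert(pack != NULL);
	oid->kind = OID_PACK_OFFSET;
	oid->u.loc.pack = pack;
	oid->u.loc.offset = offset;
}

/*
 * Convert the name down to <sha1>.  This overwrites the shared
 * storage, so the <pack, offset> pair is no longer available and a
 * later read has to go back through the .idx lookup.
 */
static void oid_set_sha1(struct oid_sketch *oid, const unsigned char *sha1)
{
	oid->kind = OID_SHA1;
	memcpy(oid->u.sha1, sha1, SKETCH_RAWSZ);
}

/* True only while the name still carries its pack location. */
static int oid_has_location(const struct oid_sketch *oid)
{
	return oid->kind == OID_PACK_OFFSET;
}
```

After oid_set_sha1() the location is gone, which is the case where
carrying <pack, offset> around in the object name buys nothing.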

To me, <pack, offset> information smells like it belongs more to a
"struct object" (or its subclass) as an optional annotation: when a
caller calls parse_object(), you would bypass the read_sha1_file()
that goes and looks the object name up in the list of pack .idx
files, and instead go straight to the data using that annotation.
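
Something along these lines, as a sketch only.  "object_sketch",
"pack_handle", "annotate_location" and "choose_read_path" are
hypothetical names for this example, not git's struct object or any
existing function, and the actual reading of pack data is left out.
The object name stays a plain <sha1>; the location rides along as an
optional hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an open pack; not git's struct packed_git. */
struct pack_handle { const char *name; };

/* Hypothetical object with an optional <pack, offset> annotation. */
struct object_sketch {
	unsigned char sha1[20];          /* the object name, always valid */
	const struct pack_handle *pack;  /* NULL when location is unknown */
	uint64_t offset;                 /* meaningful only if pack != NULL */
};

enum read_path {
	READ_VIA_IDX_LOOKUP,    /* usual way: search the pack .idx files */
	READ_DIRECT_FROM_PACK   /* annotation present: skip the lookup */
};

/* Record where the object lives without touching its name. */
static void annotate_location(struct object_sketch *obj,
			      const struct pack_handle *pack, uint64_t offset)
{
	assert(obj != NULL && pack != NULL);
	obj->pack = pack;
	obj->offset = offset;
}

/* Decide how a parse step would fetch the object's contents. */
static enum read_path choose_read_path(const struct object_sketch *obj)
{
	assert(obj != NULL);
	return obj->pack ? READ_DIRECT_FROM_PACK : READ_VIA_IDX_LOOKUP;
}
```

With this shape nothing is lost when the name is needed as <sha1>,
and objects without the annotation keep taking the usual path.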