Re: [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thomas Rast wrote:
> diff --git a/commit.c b/commit.c
> index 888e02a..00e8d4a 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -31,8 +31,12 @@ static struct commit *check_commit(struct object *obj,
>  struct commit *lookup_commit_reference_gently(const unsigned char *sha1,
>                                               int quiet)
>  {
> -       struct object *obj = deref_tag(parse_object(sha1), NULL, 0);
> -
> +       struct object *obj;
> +       int type = sha1_object_info(sha1, NULL);
> +       /* If it's neither tag nor commit, parsing the object is wasted effort */
> +       if (type != OBJ_TAG && type != OBJ_COMMIT)
> +               return NULL;
> +       obj = deref_tag(parse_object(sha1), NULL, 0);
>         if (!obj)
>                 return NULL;
>         return check_commit(obj, sha1, quiet);

As Jeff points out, you've introduced an extra sha1_object_info() call
in the common case of tag (which derefs into a commit anyway) and
commit slowing things down.

So, my main doubt centres around how sha1_object_info() determines the
type of the object without actually parsing it.  You have to open up
the file and look at the fields near the top, no? (or fallback to blob
failing that).  I am reading it:

1. It calls sha1_loose_object_info() or sha1_packed_object_info(),
depending on whether the particular file is in-pack or not.  Lets see
what is common between them.

2. The loose counterpart seems to call unpack_sha1_header() after
mmap'ing the file.  This ultimately ends up calling
unpack_object_header_buffer(), which is also what the packed
counterpart calls.

3. I didn't understand what unpack_object_header_buffer() is doing.
And'ing with some magic 0x80 and shifting by 4 bits iteratively? type
= (c >> 4) & 7?

In contrast, parse_object() first calls lookup_object() to look it up
in some hashtable to get the type -- the packfile idx, presumably?
Why don't you also do that instead of sha1_object_info()?  Or, why
don't you wrap parse_object() in an API that doesn't go beyond the
first blob check (and not execute parse_object_buffer())?

Also, does this patch fix the bug Alex reported?

Apologies if I've misunderstood something horribly (which does seem to
be the case).

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]