On Wed, Jun 26, 2013 at 12:48 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > After this, the function returns. The original did not add to the > table the object name we are looking at, but the new code first adds > it to the table with the unconditional kh_put_sha1() above. Is a > call to kh_del_sha1() missing here ... No, this is not the case. That's the return case for when *the object was found because it already existed in the hash table* (hence we access it if we're excluding it, to tag it as excluded). We don't want to remove it from the hash table because we're not the ones we inserted it. We only call `kh_del_sha1` in the cases where: 1. The object wasn't found. 2. We inserted its key on the hash table. 3. We later learnt that we don't really want to pack this object. > >> @@ -921,38 +916,42 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type, >> return 0; >> } >> >> - if (!exclude && local && has_loose_object_nonlocal(sha1)) >> + if (!exclude && local && has_loose_object_nonlocal(sha1)) { >> + kh_del_sha1(packed_objects, ix); >> return 0; > > ... like this one, which seems to compensate for "ahh, after all we > realize we do not want to add this one to the table"? > >> @@ -966,19 +965,30 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type, >> entry->in_pack_offset = found_offset; >> } >> >> - if (object_ix_hashsz * 3 <= nr_objects * 4) >> - rehash_objects(); >> - else >> - object_ix[-1 - ix] = nr_objects; >> + kh_value(packed_objects, ix) = entry; >> + kh_key(packed_objects, ix) = entry->idx.sha1; >> + objects[nr_objects++] = entry; >> >> display_progress(progress_state, nr_objects); >> >> - if (name && no_try_delta(name)) >> - entry->no_try_delta = 1; >> - >> return 1; >> } >> >> +static int add_object_entry(const unsigned char *sha1, enum object_type type, >> + const char *name, int exclude) >> +{ >> + if (add_object_entry_1(sha1, type, name_hash(name), exclude, NULL, 0)) { >> + struct object_entry *entry = objects[nr_objects - 1]; >> + >> + if (name && no_try_delta(name)) >> + entry->no_try_delta = 1; >> + >> + return 1; >> + } >> + >> + return 0; >> +} > > It is somewhat unclear what we are getting from the split of the > main part of this function into *_1(), other than the *_1() function > now has a very deep indentation inside "if (!found_pack)", which is > always true because the caller always passes NULL to found_pack. > Perhaps this is an unrelated refactoring that is needed for later > steps and does not have anything to do with the use of new hash > function? Yes, apologies for not making this clear. By refactoring into `_1`, you can see how `traverse_bitmap_commit_list` can use the `_1` version directly as a callback, to insert objects straight into the packing list without looking them up. This is very efficient because we can pass the whole API straight from the bitmap code: 1. The SHA1: we find it by simply looking up the `nth` sha1 on the pack index (if we are yielding bit `n`) 2. The object type: we find it because we have type indexes that let us know the type of any given bit in the bitmap by and-ing it with the index. 3. The hash for its name: we can look it up from the name hash cache in the new bitmap format. 4. Exclude flag: we never exclude when working with bitmaps 5. found_pack: all the bitmapped objects come from the same pack! 6. found_offset: we find it by simply looking up the `nth` offset on the pack index (if we are yielding bit `n`) Boom! We filled the callback just from the data in a bitmap. Ain't that nice? Let me amend the commit message. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html