Re: [PATCH 03/16] pack-objects: use a faster hash table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Vicent Marti <tanoku@xxxxxxxxx> writes:

> @@ -901,19 +896,19 @@ static int no_try_delta(const char *path)
>  	return 0;
>  }
>  
> -static int add_object_entry(const unsigned char *sha1, enum object_type type,
> -			    const char *name, int exclude)
> +static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
> +			    uint32_t hash, int exclude, struct packed_git *found_pack,
> +				off_t found_offset)
>  {
>  	struct object_entry *entry;
> -	struct packed_git *p, *found_pack = NULL;
> -	off_t found_offset = 0;
> -	int ix;
> -	unsigned hash = name_hash(name);
> +	struct packed_git *p;
> +	khiter_t ix;
> +	int hash_ret;
>  
> -	ix = nr_objects ? locate_object_entry_hash(sha1) : -1;
> -	if (ix >= 0) {
> +	ix = kh_put_sha1(packed_objects, sha1, &hash_ret);
> +	if (hash_ret == 0) {
>  		if (exclude) {
> -			entry = objects + object_ix[ix] - 1;
> +			entry = kh_value(packed_objects, ix);
>  			if (!entry->preferred_base)
>  				nr_result--;
>  			entry->preferred_base = 1;

After this, the function returns.  The original did not add to the
table the object name we are looking at, but the new code first adds
it to the table with the unconditional kh_put_sha1() above.  Is a
call to kh_del_sha1() missing here ...

> @@ -921,38 +916,42 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  		return 0;
>  	}
>  
> -	if (!exclude && local && has_loose_object_nonlocal(sha1))
> +	if (!exclude && local && has_loose_object_nonlocal(sha1)) {
> +		kh_del_sha1(packed_objects, ix);
>  		return 0;

... like this one, which seems to compensate for "ahh, after all we
realize we do not want to add this one to the table"?

> @@ -966,19 +965,30 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  		entry->in_pack_offset = found_offset;
>  	}
>  
> -	if (object_ix_hashsz * 3 <= nr_objects * 4)
> -		rehash_objects();
> -	else
> -		object_ix[-1 - ix] = nr_objects;
> +	kh_value(packed_objects, ix) = entry;
> +	kh_key(packed_objects, ix) = entry->idx.sha1;
> +	objects[nr_objects++] = entry;
>  
>  	display_progress(progress_state, nr_objects);
>  
> -	if (name && no_try_delta(name))
> -		entry->no_try_delta = 1;
> -
>  	return 1;
>  }
>  
> +static int add_object_entry(const unsigned char *sha1, enum object_type type,
> +			    const char *name, int exclude)
> +{
> +	if (add_object_entry_1(sha1, type, name_hash(name), exclude, NULL, 0)) {
> +		struct object_entry *entry = objects[nr_objects - 1];
> +
> +		if (name && no_try_delta(name))
> +			entry->no_try_delta = 1;
> +
> +		return 1;
> +	}
> +
> +	return 0;
> +}

It is somewhat unclear what we are getting from the split of the
main part of this function into *_1(), other than the *_1() function
now has a very deep indentation inside "if (!found_pack)", which is
always true because the caller always passes NULL to found_pack.
Perhaps this is an unrelated refactoring that is needed for later
steps and does not have anything to do with the use of new hash
function?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]