Re: [PATCH v2 06/11] commit-graph: examine commits by generation number

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/18/2020 7:32 PM, Jakub Narebski wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>>
>> When running 'git commit-graph write --changed-paths', we sort the
>> commits by pack-order to save time when computing the changed-paths
>> bloom filters. This does not help when finding the commits via the
>> --reachable flag.
> 
> Minor improvement suggestion: s/--reachable flag/'--reachable' flag/.
> 

Sure. 


>>                     Commits with similar generation are more likely
>> to have many trees in common, making the diff faster.
> 
> Is this what causes the performance improvement, that subsequently
> examined commits are more likely to have more trees in common, which
> means that those trees would be hot in cache, making generating diff
> faster?  Is it what profiling shows?
> 

Yes. 

>>
>> Helped-by: Jeff King <peff@xxxxxxxx>
>> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>> Signed-off-by: Garima Singh <garima.singh@xxxxxxxxxxxxx>
>> ---
>>  commit-graph.c | 33 ++++++++++++++++++++++++++++++---
>>  1 file changed, 30 insertions(+), 3 deletions(-)
>>
>> diff --git a/commit-graph.c b/commit-graph.c
>> index e125511a1c..32a315058f 100644
>> --- a/commit-graph.c
>> +++ b/commit-graph.c
>> @@ -70,6 +70,25 @@ static int commit_pos_cmp(const void *va, const void *vb)
>>  	       commit_pos_at(&commit_pos, b);
>>  }
>>  
>> +static int commit_gen_cmp(const void *va, const void *vb)
>> +{
>> +	const struct commit *a = *(const struct commit **)va;
>> +	const struct commit *b = *(const struct commit **)vb;
>> +
>> +	/* lower generation commits first */
> 
> Shouldn't higher generation commits come first, in recency-like order?
> Or it doesn't matter if it is sorted in ascending or descending order,
> as long as commits with close generation numbers are examined close
> together?
> 

The direction does not matter. Locality is important. 

>> +	if (a->generation < b->generation)
>> +		return -1;
>> +	else if (a->generation > b->generation)
>> +		return 1;
>> +
>> +	/* use date as a heuristic when generations are equal */
>> +	if (a->date < b->date)
>> +		return -1;
>> +	else if (a->date > b->date)
>> +		return 1;
>> +	return 0;
>> +}
> 
> I thought we have had such comparison function defined somewhere in Git
> already, but I think I'm wrong here.
> 

It actually exists in commit.h
I will just use it here. 
Thanks for pointing it out! 

>> +
>>  char *get_commit_graph_filename(const char *obj_dir)
>>  {
>>  	char *filename = xstrfmt("%s/info/commit-graph", obj_dir);
>> @@ -821,7 +840,8 @@ struct write_commit_graph_context {
>>  		 report_progress:1,
>>  		 split:1,
>>  		 check_oids:1,
>> -		 changed_paths:1;
>> +		 changed_paths:1,
>> +		 order_by_pack:1;
>>  
>>  	const struct split_commit_graph_opts *split_opts;
>>  	uint32_t total_bloom_filter_data_size;
>> @@ -1184,7 +1204,11 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
>>  
>>  	ALLOC_ARRAY(sorted_by_pos, ctx->commits.nr);
>>  	COPY_ARRAY(sorted_by_pos, ctx->commits.list, ctx->commits.nr);
>> -	QSORT(sorted_by_pos, ctx->commits.nr, commit_pos_cmp);
>> +
>> +	if (ctx->order_by_pack)
>> +		QSORT(sorted_by_pos, ctx->commits.nr, commit_pos_cmp);
>> +	else
>> +		QSORT(sorted_by_pos, ctx->commits.nr, commit_gen_cmp);
> 
> Here 'sorted_b_pos' variable name no longer reflects reality...
> (see comment to the previous patch in the series).
> 

Yup. Fixing. 

Thanks!
Garima Singh



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux