Re: [PATCH v8 4/5] Implement line-history search (git log -L)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thomas Rast <trast@xxxxxxxxxxxxxxx> writes:

> Junio C Hamano <gitster@xxxxxxxxx> writes:
>
>> I notice that "careful and slow" is just "too slow
>> to be usable" even on a small tree like ours.  Try running
>>
>>     $ git log -M -L:get_name:builtin/describe.c
>>
>> and see how long you have to wait until you hit the first line of
>> output.
>
> I'll dig some more.  It *should* be essentially the following times
> taken together:
>
>   $ time git log --raw -M --topo-order >/dev/null
>   real    0m5.448s
[...]
>   $ time git log -L:get_name:builtin/describe.c >/dev/null
>   real    0m0.832s
[...]
>   $ time git log -L:get_name:builtin-describe.c 81b50f3ce40^ >/dev/null
>   real    0m0.489s
[...]
> So I'm losing a factor of about 4 somewhere, which I can't explain right
> now.

It seems I still don't understand half of this code.

log -M --raw in the above somehow appears to use and optimize for
-M100%, whereas the log -L code is currently written for general args.

However, I couldn't pin down where this happens; I only know from call
graph profiling[1] that log -M --raw never goes through diffcore_std().
And indeed according to the same sort of profiling, log -M -L spends
most of its time within diffcore_std() unpacking blobs to find renames.

Even more confusingly, try_to_follow_renames() _does_ call into
diffcore_std, so there seems to be some merit to calling it after all.

With the hacky patch below,  I get something more reasonable:

  $ time ./git-log -M -L:get_name:builtin/describe.c  >/dev/null

  real    0m3.794s
  user    0m3.734s
  sys     0m0.045s

That's compared to about 35s on my machine without the patch.  It still
calls diffcore_std(), but before that it discards all diff pairs except
those affecting the path(s) we're interested in and any deletions (so
that they can be used as rename sources).  After diffcore_std() it
discards all pairs that we're not interested in.

[1]  valgrind --tool=callgrind --trace-children=yes

-- >8 --
Subject: [PATCH] WIP: speed up log -L... -M

---
 line-log.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 54 insertions(+), 4 deletions(-)

diff --git a/line-log.c b/line-log.c
index a74bbaf..b03cc0b 100644
--- a/line-log.c
+++ b/line-log.c
@@ -997,7 +997,52 @@ static void move_diff_queue(struct diff_queue_struct *dst,
 	DIFF_QUEUE_CLEAR(src);
 }
 
-static void queue_diffs(struct diff_options *opt,
+static void filter_diffs_for_paths(struct line_log_data *range, int keep_deletions)
+{
+	int i;
+	struct diff_queue_struct outq;
+	DIFF_QUEUE_CLEAR(&outq);
+
+	/* fprintf(stderr, "-- filtering:\n"); */
+
+	for (i = 0; i < diff_queued_diff.nr; i++) {
+		struct diff_filepair *p = diff_queued_diff.queue[i];
+		struct line_log_data *rg = NULL;
+		/* fprintf(stderr, "%-38s\t%-38s\n", (p->one ? p->one->path : "<none>"), (p->two ? p->two->path : "<none>")); */
+		if (!DIFF_FILE_VALID(p->two)) {
+			if (keep_deletions)
+				diff_q(&outq, p);
+			else
+				diff_free_filepair(p);
+			continue;
+		}
+		for (rg = range; rg; rg = rg->next) {
+			if (!strcmp(rg->spec->path, p->two->path))
+				break;
+		}
+		if (rg)
+			diff_q(&outq, p);
+		else
+			diff_free_filepair(p);
+	}
+	free(diff_queued_diff.queue);
+	diff_queued_diff = outq;
+}
+
+static inline int diff_might_be_rename(void)
+{
+	int i;
+	for (i = 0; i < diff_queued_diff.nr; i++)
+		if (!DIFF_FILE_VALID(diff_queued_diff.queue[i]->one)) {
+			/* fprintf(stderr, "diff_might_be_rename found creation of: %s\n", */
+			/* 	diff_queued_diff.queue[i]->two->path); */
+			return 1;
+		}
+	return 0;
+}
+
+static void queue_diffs(struct line_log_data *range,
+			struct diff_options *opt,
 			struct diff_queue_struct *queue,
 			struct commit *commit, struct commit *parent)
 {
@@ -1013,7 +1058,12 @@ static void queue_diffs(struct diff_options *opt,
 
 	DIFF_QUEUE_CLEAR(&diff_queued_diff);
 	diff_tree(&desc1, &desc2, "", opt);
-	diffcore_std(opt);
+	if (opt->detect_rename) {
+		filter_diffs_for_paths(range, 1);
+		if (diff_might_be_rename())
+			diffcore_std(opt);
+		filter_diffs_for_paths(range, 0);
+	}
 	move_diff_queue(queue, &diff_queued_diff);
 
 	if (tree1)
@@ -1297,7 +1347,7 @@ static int process_ranges_ordinary_commit(struct rev_info *rev, struct commit *c
 	if (commit->parents)
 		parent = commit->parents->item;
 
-	queue_diffs(&rev->diffopt, &queue, commit, parent);
+	queue_diffs(range, &rev->diffopt, &queue, commit, parent);
 	changed = process_all_files(&parent_range, rev, &queue, range);
 	if (parent)
 		add_line_range(rev, parent, parent_range);
@@ -1322,7 +1372,7 @@ static int process_ranges_merge_commit(struct rev_info *rev, struct commit *comm
 	for (i = 0; i < nparents; i++) {
 		parents[i] = p->item;
 		p = p->next;
-		queue_diffs(&rev->diffopt, &diffqueues[i], commit, parents[i]);
+		queue_diffs(range, &rev->diffopt, &diffqueues[i], commit, parents[i]);
 	}
 
 	for (i = 0; i < nparents; i++) {
-- 
1.8.2.rc1.391.g6a988e5.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]