[PATCHv3 3/3] gitweb: Faster project search

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Before searching by some field the information we search for must be
filled in, but we do not have to fill other fields that are not
involved in the search.

To be able to request filling only specified fields,
fill_project_list_info() was enhanced in previous commit to take
additional parameters which specify part of projects info to fill.
This way we can limit doing expensive calculations (like running
git-for-each-ref to get 'age' / "Last changed" info) to doing those
only for projects which we will show as search results.

This commit actually uses this interface, changing gitweb code from
the following behavior

  fill all project info on all projects
  search projects

to behaving like this pseudocode

  fill search fields on all projects
  search projects
  fill all project info on search results


With this commit the number of git commands used to generate search
results is 2*<matched projects> + 1, and depends on number of matched
projects rather than number of all projects (all repositories).

Note: this is 'git for-each-ref' to find last activity, and 'git config'
for each project, and 'git --version' once.


Example performance improvements, for search that selects 2
repositories out of 12 in total:

* Before (warm cache):
  "This page took 0.867151 seconds  and 27 git commands to generate."

* After (warm cache):
  "This page took 0.673643 seconds  and 5 git commands to generate."

Now imagine that they are 5 repositories out of 5000, and cold or
trashed cache case.

Signed-off-by: Jakub Narebski <jnareb@xxxxxxxxx>
---
Changes from v2 (v2 is the same as v1):
* Rewritten and extended commit message (though extending perhaps have
  gone too far...).

  Added paragraphs are "This commit actually uses this interface..."
  and "Example performance improvements..."


Junio C Hamano wrote in
http://thread.gmane.org/gmane.comp.version-control.git/190852/focus=191053
[...]
JC> "must be filled in." is correct, but that happens without the previous
JC> patch.  The first sentence must also say:
JC> 
JC>         In order to search by some field, the information we look for must
JC>         be filled in, but we do not have to fill other fields that are not
JC>         involved in the search.
JC> 
JC> to justify the previous "fill_project_list_info can be asked to return
JC> without getting unused fields" patch.  

Done.  Thanks for a suggestion.

JC> The rest of the log message then makes good sense.

Ooops...

 gitweb/gitweb.perl |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 7fb7a55..4ceb1a6 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2987,6 +2987,10 @@ sub search_projects_list {
 	return @$projlist
 		unless ($tagfilter || $searchtext);
 
+	# searching projects require filling to be run before it;
+	fill_project_list_info($projlist,
+	                       $tagfilter  ? 'ctags' : (),
+	                       $searchtext ? ('path', 'descr') : ());
 	my @projects;
  PROJECT:
 	foreach my $pr (@$projlist) {
@@ -5394,12 +5398,13 @@ sub git_project_list_body {
 	# filtering out forks before filling info allows to do less work
 	@projects = filter_forks_from_projects_list(\@projects)
 		if ($check_forks);
-	@projects = fill_project_list_info(\@projects);
-	# searching projects require filling to be run before it
+	# search_projects_list pre-fills required info
 	@projects = search_projects_list(\@projects,
 	                                 'searchtext' => $searchtext,
 	                                 'tagfilter'  => $tagfilter)
 		if ($tagfilter || $searchtext);
+	# fill the rest
+	@projects = fill_project_list_info(\@projects);
 
 	$order ||= $default_projects_order;
 	$from = 0 unless defined $from;
-- 
1.7.9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]