[PATCH v6 00/10] commit-graph write: progress output improvements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Improvements since v6:

 * Integrate my "commit-graph write: use pack order when finding
   commits" patch, and per Junio's suggestion put it at the start so
   it's easier to split the two apart.

 * A signed-off-by of mine was missing on that patch. Fixed.

 * Replace SZEDER's two patches with his re-roll at
   https://public-inbox.org/git/20190118170549.30403-1-szeder.dev@xxxxxxxxx/
   for convenienc, since mine needed rebasing on top of his.

 * SZEDER rightly pointed out that I was doing some work for nothing
   in https://public-inbox.org/git/20190118171605.GM840@xxxxxxxxxx/;
   fixed.

SZEDER Gábor (2):
  commit-graph: rename "large edges" to "extra edges"
  commit-graph: don't call write_graph_chunk_large_edges() unnecessarily

Ævar Arnfjörð Bjarmason (8):
  commit-graph write: use pack order when finding commits
  commit-graph write: add "Writing out" progress output
  commit-graph write: more descriptive "writing out" output
  commit-graph write: show progress for object search
  commit-graph write: add more descriptive progress output
  commit-graph write: remove empty line for readability
  commit-graph write: add itermediate progress
  commit-graph write: emit a percentage for all progress

 .../technical/commit-graph-format.txt         |   4 +-
 builtin/commit-graph.c                        |   4 +-
 commit-graph.c                                | 136 +++++++++++++-----
 commit-graph.h                                |   2 +-
 t/t5318-commit-graph.sh                       |  14 +-
 5 files changed, 110 insertions(+), 50 deletions(-)

Range-diff:
 1:  07d06c50c0 <  -:  ---------- commit-graph: rename 'num_extra_edges' variable to 'num_large_edges'
 -:  ---------- >  1:  9ca0b5573b commit-graph write: use pack order when finding commits
 -:  ---------- >  2:  4370ea7591 commit-graph: rename "large edges" to "extra edges"
 2:  904dda1e7a !  3:  3468e30c89 commit-graph: don't call write_graph_chunk_large_edges() unnecessarily
    @@ -10,19 +10,19 @@
         there won't be a 'Large Edge List' chunk written, only the three
         mandatory chunks.
     
    -    However, when it comes to writing chunk data, write_commit_graph()
    -    unconditionally invokes write_graph_chunk_large_edges(), even when it
    -    was decided earlier that that chunk won't be written.  Strictly
    -    speaking there is no bug here, because write_graph_chunk_large_edges()
    -    won't write anything because it won't find any commits with more than
    -    two parents, but then it unnecessarily and in vain looks through all
    -    commits once again in search for such commits.
    +    However, when it later comes to writing actual chunk data,
    +    write_commit_graph() unconditionally invokes
    +    write_graph_chunk_large_edges(), even when it was decided earlier that
    +    that chunk won't be written.  Strictly speaking there is no bug here,
    +    because write_graph_chunk_large_edges() won't write anything if it
    +    doesn't find any commits with more than two parents, but then it
    +    unnecessarily and in vain looks through all commits once again in
    +    search for such commits.
     
         Don't call write_graph_chunk_large_edges() when that chunk won't be
         written to spare an unnecessary iteration over all commits.
     
         Signed-off-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
    -    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
     
      diff --git a/commit-graph.c b/commit-graph.c
      --- a/commit-graph.c
    @@ -31,9 +31,9 @@
      	write_graph_chunk_fanout(f, commits.list, commits.nr);
      	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr);
      	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr);
    --	write_graph_chunk_large_edges(f, commits.list, commits.nr);
    -+	if (num_large_edges)
    -+		write_graph_chunk_large_edges(f, commits.list, commits.nr);
    +-	write_graph_chunk_extra_edges(f, commits.list, commits.nr);
    ++	if (num_extra_edges)
    ++		write_graph_chunk_extra_edges(f, commits.list, commits.nr);
      
      	close_commit_graph(the_repository);
      	finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_FSYNC);
 3:  1126c7e29d <  -:  ---------- commit-graph write: rephrase confusing progress output
 4:  9c17f56ed3 !  4:  1860fa606e commit-graph write: add "Writing out" progress output
    @@ -10,14 +10,14 @@
         small repositories, but before this change we'd noticeably hang for
         2-3 seconds at the end on medium sized repositories such as linux.git.
     
    -    Now we'll instead show output like this, and have no human-observable
    -    point at which we're not producing progress output:
    +    Now we'll instead show output like this, and reduce the
    +    human-observable times at which we're not producing progress output:
     
    -        $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph: 6365442, done.
    -        Annotating commit graph: 2391666, done.
    -        Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph: 100% (3188888/3188888), done.
    +        $ ~/g/git/git --exec-path=$HOME/g/git -C ~/g/2015-04-03-1M-git commit-graph write
    +        Finding commits for commit graph: 13064614, done.
    +        Expanding reachable commits in commit graph: 1000447, done.
    +        Computing commit graph generation numbers: 100% (1000447/1000447), done.
    +        Writing out commit graph: 100% (3001341/3001341), done.
     
         This "Writing out" number is 3x or 4x the number of commits, depending
         on the graph we're processing. A later change will make this explicit
    @@ -87,7 +87,7 @@
      		hashwrite(f, get_commit_tree_oid(*list)->hash, hash_len);
     @@
      
    - static void write_graph_chunk_large_edges(struct hashfile *f,
    + static void write_graph_chunk_extra_edges(struct hashfile *f,
      					  struct commit **commits,
     -					  int nr_commits)
     +					  int nr_commits,
    @@ -107,7 +107,7 @@
      		     parent = parent->next)
      			num_parents++;
     @@
    - 	int num_large_edges;
    + 	int num_extra_edges;
      	struct commit_list *parent;
      	struct progress *progress = NULL;
     +	uint64_t progress_cnt = 0;
    @@ -121,25 +121,16 @@
     -	write_graph_chunk_fanout(f, commits.list, commits.nr);
     -	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr);
     -	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr);
    -+	if (report_progress) {
    -+		/*
    -+		 * Each of the write_graph_chunk_*() functions just
    -+		 * below loops over our N commits. This number must be
    -+		 * kept in sync with the number of passes we're doing.
    -+		 */
    -+		int graph_passes = 3;
    -+		if (num_large_edges)
    -+			graph_passes++;
    ++	if (report_progress)
     +		progress = start_delayed_progress(
     +			_("Writing out commit graph"),
    -+			graph_passes * commits.nr);
    -+	}
    ++			num_chunks * commits.nr);
     +	write_graph_chunk_fanout(f, commits.list, commits.nr, progress, &progress_cnt);
     +	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr, progress, &progress_cnt);
     +	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr, progress, &progress_cnt);
    - 	if (num_large_edges)
    --		write_graph_chunk_large_edges(f, commits.list, commits.nr);
    -+		write_graph_chunk_large_edges(f, commits.list, commits.nr, progress, &progress_cnt);
    + 	if (num_extra_edges)
    +-		write_graph_chunk_extra_edges(f, commits.list, commits.nr);
    ++		write_graph_chunk_extra_edges(f, commits.list, commits.nr, progress, &progress_cnt);
     +	stop_progress(&progress);
      
      	close_commit_graph(the_repository);
 5:  79b0a467d9 !  5:  cc36909add commit-graph write: more descriptive "writing out" output
    @@ -11,11 +11,11 @@
         has to do with writing out the commit graph. Now e.g. on linux.git we
         emit:
     
    -        $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph: 6365442, done.
    -        Annotating commit graph: 2391666, done.
    -        Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph in 4 passes: 100% (3188888/3188888), done.
    +        $ ~/g/git/git --exec-path=$HOME/g/git -C ~/g/linux commit-graph write
    +        Finding commits for commit graph: 6529159, done.
    +        Expanding reachable commits in commit graph: 815990, done.
    +        Computing commit graph generation numbers: 100% (815983/815983), done.
    +        Writing out commit graph in 4 passes: 100% (3263932/3263932), done.
     
         A note on i18n: Why are we using the Q_() function and passing a
         number & English text for a singular which'll never be used? Because
    @@ -36,26 +36,28 @@
      	if (!commit_graph_compatible(the_repository))
      		return;
     @@
    - 		int graph_passes = 3;
    - 		if (num_large_edges)
    - 			graph_passes++;
    + 		hashwrite(f, chunk_write, 12);
    + 	}
    + 
    +-	if (report_progress)
    ++	if (report_progress) {
     +		strbuf_addf(&progress_title,
     +			    Q_("Writing out commit graph in %d pass",
     +			       "Writing out commit graph in %d passes",
    -+			       graph_passes),
    -+			    graph_passes);
    ++			       num_chunks),
    ++			    num_chunks);
      		progress = start_delayed_progress(
     -			_("Writing out commit graph"),
     +			progress_title.buf,
    - 			graph_passes * commits.nr);
    - 	}
    + 			num_chunks * commits.nr);
    ++	}
      	write_graph_chunk_fanout(f, commits.list, commits.nr, progress, &progress_cnt);
    -@@
    - 		write_graph_chunk_large_edges(f, commits.list, commits.nr, progress, &progress_cnt);
    + 	write_graph_chunk_oids(f, GRAPH_OID_LEN, commits.list, commits.nr, progress, &progress_cnt);
    + 	write_graph_chunk_data(f, GRAPH_OID_LEN, commits.list, commits.nr, progress, &progress_cnt);
    + 	if (num_extra_edges)
    + 		write_graph_chunk_extra_edges(f, commits.list, commits.nr, progress, &progress_cnt);
      	stop_progress(&progress);
    - 
     +	strbuf_release(&progress_title);
    -+
    + 
      	close_commit_graph(the_repository);
      	finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_FSYNC);
    - 	commit_lock_file(&lk);
 6:  b32be83b38 !  6:  9a4aec16ca commit-graph write: show progress for object search
    @@ -8,12 +8,12 @@
     
         Before we'd emit on e.g. linux.git with "commit-graph write":
     
    -        Finding commits for commit graph: 6365442, done.
    +        Finding commits for commit graph: 6529159, done.
             [...]
     
         And now:
     
    -        Finding commits for commit graph: 100% (6365442/6365442), done.
    +        Finding commits for commit graph: 100% (6529159/6529159), done.
             [...]
     
         Since the commit graph only includes those commits that are packed
    @@ -59,7 +59,8 @@
     -				_("Finding commits for commit graph"), 0);
     +				_("Finding commits for commit graph"),
     +				approx_nr_objects);
    - 		for_each_packed_object(add_packed_commits, &oids, 0);
    + 		for_each_packed_object(add_packed_commits, &oids,
    + 				       FOR_EACH_OBJECT_PACK_ORDER);
     +		if (oids.progress_done < approx_nr_objects)
     +			display_progress(oids.progress, approx_nr_objects);
      		stop_progress(&oids.progress);
 7:  54276723c0 !  7:  6b523d0f0d commit-graph write: add more descriptive progress output
    @@ -10,17 +10,17 @@
         we support:
     
             $ git commit-graph write
    -        Finding commits for commit graph among packed objects: 100% (6365442/6365442), done.
    +        Finding commits for commit graph among packed objects: 100% (6529159/6529159), done.
             [...]
     
             # Actually we don't emit this since this takes almost no time at
             # all. But if we did (s/_delayed//) we'd show:
             $ git for-each-ref --format='%(objectname)' | git commit-graph write --stdin-commits
    -        Finding commits for commit graph from 584 refs: 100% (584/584), done.
    +        Finding commits for commit graph from 630 refs: 100% (630/630), done.
             [...]
     
             $ (cd .git/objects/pack/ && ls *idx) | git commit-graph write --stdin-pack
    -        Finding commits for commit graph in 2 packs: 6365442, done.
    +        Finding commits for commit graph in 3 packs: 6529159, done.
             [...]
     
         The middle on of those is going to be the output users might see in
    @@ -89,5 +89,5 @@
     -				_("Finding commits for commit graph"),
     +				_("Finding commits for commit graph among packed objects"),
      				approx_nr_objects);
    - 		for_each_packed_object(add_packed_commits, &oids, 0);
    - 		if (oids.progress_done < approx_nr_objects)
    + 		for_each_packed_object(add_packed_commits, &oids,
    + 				       FOR_EACH_OBJECT_PACK_ORDER);
 8:  0e847366e1 =  8:  582f6f477f commit-graph write: remove empty line for readability
 9:  c388aff73e !  9:  2c9fa68210 commit-graph write: add itermediate progress
    @@ -10,26 +10,29 @@
         million objects we'll now emit:
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph among packed objects: 100% (48333911/48333911), done.
    -        Annotating commit graph: 21435984, done.
    -        Counting distinct commits in commit graph: 100% (7145328/7145328), done.
    -        Finding extra edges in commit graph: 100% (7145328/7145328), done.
    -        Computing commit graph generation numbers: 100% (7145328/7145328), done.
    -        Writing out commit graph in 4 passes: 100% (28581312/28581312), done.
    +        Finding commits for commit graph among packed objects: 100% (124763727/124763727), done.
    +        Loading known commits in commit graph: 100% (18989461/18989461), done.
    +        Expanding reachable commits in commit graph: 100% (18989507/18989461), done.
    +        Clearing commit marks in commit graph: 100% (18989507/18989507), done.
    +        Counting distinct commits in commit graph: 100% (18989507/18989507), done.
    +        Finding extra edges in commit graph: 100% (18989507/18989507), done.
    +        Computing commit graph generation numbers: 100% (7250302/7250302), done.
    +        Writing out commit graph in 4 passes: 100% (29001208/29001208), done.
     
         Whereas on a medium-sized repository such as linux.git these new
         progress bars won't have time to kick in and as before and we'll still
         emit output like:
     
             $ ~/g/git/git --exec-path=$HOME/g/git commit-graph write
    -        Finding commits for commit graph among packed objects: 100% (6365442/6365442), done.
    -        Annotating commit graph: 2391666, done.
    -        Computing commit graph generation numbers: 100% (797222/797222), done.
    -        Writing out commit graph in 4 passes: 100% (3188888/3188888), done.
    +        Finding commits for commit graph among packed objects: 100% (6529159/6529159), done.
    +        Expanding reachable commits in commit graph: 815990, done.
    +        Computing commit graph generation numbers: 100% (815983/815983), done.
    +        Writing out commit graph in 4 passes: 100% (3263932/3263932), done.
     
         The "Counting distinct commits in commit graph" phase will spend most
         of its time paused at "0/*" as we QSORT(...) the list. That's not
    -    optimal, but at least we don't seem to be stalling anymore.
    +    optimal, but at least we don't seem to be stalling anymore most of the
    +    time.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>
     
    @@ -59,7 +62,7 @@
     @@
      	ALLOC_ARRAY(commits.list, commits.alloc);
      
    - 	num_large_edges = 0;
    + 	num_extra_edges = 0;
     +	if (report_progress)
     +		progress = start_delayed_progress(
     +			_("Finding extra edges in commit graph"),
    @@ -73,7 +76,7 @@
     @@
      		commits.nr++;
      	}
    - 	num_chunks = num_large_edges ? 4 : 3;
    + 	num_chunks = num_extra_edges ? 4 : 3;
     +	stop_progress(&progress);
      
      	if (commits.nr >= GRAPH_PARENT_MISSING)
10:  fd692499e0 <  -:  ---------- commit-graph write: emit a percentage for all progress
 -:  ---------- > 10:  e69bb1be2c commit-graph write: emit a percentage for all progress
-- 
2.20.1.153.gd81d796ee0




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux