On Thu, Jan 25, 2018 at 6:02 AM, Derrick Stolee <stolee@xxxxxxxxx> wrote:
> Teach Git to inspect a packed graph to supply the contents of a
> struct commit when calling parse_commit_gently(). This implementation
> satisfies all post-conditions on the struct commit, including loading
> parents, the root tree, and the commit date. The only loosely-expected
> condition is that the commit buffer is loaded into the cache. This
> was checked in log-tree.c:show_log(), but the "return;" on failure
> produced unexpected results (i.e. the message line was never terminated).
> The new behavior of loading the buffer when needed prevents the
> unexpected behavior.
>
> If core.graph is false, then do not load the graph and behave as usual.
>
> In test script t5319-graph.sh, add output-matching conditions on read-
> only graph operations.
>
> By loading commits from the graph instead of parsing commit buffers, we
> save a lot of time on long commits walks. Here are some performance
> results for a copy of the Linux repository where 'master' has 704,766
> reachable commits and is behind 'origin/master' by 19,610 commits.
>
> | Command                          | Before | After  | Rel % |
> |----------------------------------|--------|--------|-------|
> | log --oneline --topo-order -1000 | 5.9s   | 0.7s   | -88%  |
> | branch -vv                       | 0.42s  | 0.27s  | -35%  |
> | rev-list --all                   | 6.4s   | 1.0s   | -84%  |
> | rev-list --all --objects         | 32.6s  | 27.6s  | -15%  |

This sounds impressive!

> @@ -383,19 +384,27 @@ int parse_commit_gently(struct commit *item, int quiet_on_missing)
>
>  	if (!item)
>  		return -1;
> +
> +	// If we already parsed, but got it from the graph, then keep going!

Comment style: please use /* ... */ here rather than //.

>  	if (item->object.parsed)
>  		return 0;
> +
> +	if (check_packed && parse_packed_commit(item))
> +		return 0;
> +
>  	buffer = read_sha1_file(item->object.oid.hash, &type, &size);
>  	if (!buffer)
>  		return quiet_on_missing ? -1 :
>  			error("Could not read %s",
> -			     oid_to_hex(&item->object.oid));
> +					oid_to_hex(&item->object.oid));
>  	if (type != OBJ_COMMIT) {
>  		free(buffer);
>  		return error("Object %s not a commit",
> -			     oid_to_hex(&item->object.oid));
> +					oid_to_hex(&item->object.oid));
>  	}
> +
>  	ret = parse_commit_buffer(item, buffer, size);
> +

I guess the new lines are for readability? Not sure if it will play out
nicely with merges in this area, though. (I touch this area of the code
as well, in the not-yet-sent series that adds the repository as an
argument all over the place. Not your problem, just me getting anxious.)

> @@ -34,6 +34,8 @@
>  #define GRAPH_CHUNKLOOKUP_SIZE (5 * 12)
>  #define GRAPH_MIN_SIZE (GRAPH_CHUNKLOOKUP_SIZE + GRAPH_FANOUT_SIZE + \
>  			GRAPH_OID_LEN + sizeof(struct packed_graph_header))
> +/* global storage */
> +struct packed_graph *packed_graph = 0;
>
>  struct object_id *get_graph_head_oid(const char *pack_dir, struct object_id *oid)
>  {
> @@ -209,6 +211,225 @@ struct packed_graph *load_packed_graph_one(const char *graph_file, const char *p
>  	return graph;
>  }
>
> +static void prepare_packed_graph_one(const char *obj_dir)
> +{
> +	char *graph_file;
> +	struct object_id oid;
> +	struct strbuf pack_dir = STRBUF_INIT;
> +	strbuf_addstr(&pack_dir, obj_dir);
> +	strbuf_add(&pack_dir, "/pack", 5);
> +
> +	if (!get_graph_head_oid(pack_dir.buf, &oid))
> +		return;
> +
> +	graph_file = get_graph_filename_oid(pack_dir.buf, &oid);
> +
> +	packed_graph = load_packed_graph_one(graph_file, pack_dir.buf);
> +	strbuf_release(&pack_dir);
> +}
> +
> +static int prepare_packed_graph_run_once = 0;

Okay. :( Seeing new globals like these gives me extra motivation to get
the object store series going.
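
To illustrate the concern about globals, here is a minimal sketch of how
the run-once flag and the graph pointer could live in a per-repository
struct instead of at file scope. Every name in it (repo_sketch,
graph_sketch, load_graph_sketch, prepare_graph_sketch) is hypothetical
and appears neither in this patch nor in any posted series, and the
loader is a stub standing in for the real file parsing.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the patch or in git. */
struct graph_sketch {
	int num_commits;
};

struct repo_sketch {
	int graph_enabled;           /* stands in for the core.graph setting */
	int graph_prepared;          /* replaces the file-scope run-once flag */
	int load_attempts;           /* only here to make the behavior observable */
	struct graph_sketch *graph;  /* replaces the file-scope graph pointer */
	struct graph_sketch storage; /* placeholder for data loaded from disk */
};

/* Stub loader: a real one would open and validate the graph file. */
static struct graph_sketch *load_graph_sketch(struct repo_sketch *r)
{
	r->load_attempts++;
	r->storage.num_commits = 3;
	return &r->storage;
}

/*
 * Load the graph at most once per repository. A NULL return means
 * "no graph, behave as usual".
 */
static struct graph_sketch *prepare_graph_sketch(struct repo_sketch *r)
{
	if (!r->graph_enabled)
		return NULL;
	if (!r->graph_prepared) {
		r->graph_prepared = 1;
		r->graph = load_graph_sketch(r);
	}
	return r->graph;
}
```

With the state carried in the struct, two repositories in one process
would each prepare their own graph independently, which a single
file-scope flag cannot express.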