On 12/6/2018 6:36 PM, Jonathan Tan wrote:
AFAICT oid_object_info doesn't take advantage of the commit graph,
but just looks up the object header, which is still less than completely
parsing it. Then lookup_commit is overly strict, as it may return
NULL as when there still is a type mismatch (I don't think a mismatch
could happen here, as both rely on just the object store, and not the
commit graph.), so this would be just defensive programming for
the sake of it. I dunno.
struct commit *c;
if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT &&
(c = lookup_commit(revs->repo, oid)) &&
!repo_parse_commit(revs->repo, c))
object = (struct object *) c;
else
object = parse_object(revs->repo, oid);
I like this way better - I'll do it in the next version.
If we do _not_ have a commit-graph or if the commit-graph does not have
that commit, this will have the same performance problem, right?
Should we instead create a direct dependence on the commit-graph, and try
to parse the oid from the graph directly? If it succeeds, then we learn
that the object is a commit, in addition to all of the parsing work. This
means we could avoid oid_object_info() loading data if we succeed. We
would fall back to parse_object() if it fails.
I was thinking this should be a simple API call to parse_commit_in_graph(),
but that requires a struct commit filled with an oid, which is not the
best idea if we don't actually know it is a commit yet.
The approach I recommend would then be more detailed:
1. Modify find_commit_in_graph() to take a struct object_id instead of a
struct commit. This helps find the integer position in the graph. That
position can be used in fill_commit_in_graph() to load the commit
contents. Keep find_commit_in_graph() static as it should not be a
public function.
2. Create a public function with prototype
struct commit *try_parse_commit_from_graph(struct repository *r, struct
object_id *oid)
that returns a commit struct fully parsed if and only if the repository
has that oid. It can call find_commit_in_graph(), then
lookup_commit() and
fill_commit_in_graph() to create the commit and parse the data.
3. In replace of the snippet above, do:
struct commit *c;
if ((c = try_parse_commit_from_graph(revs->repo, oid))
object = (struct object *)c;
else
object = parse_object(revs->repo, oid);
A similar pattern _could_ be used in parse_object(), but I don't recommend
doing this pattern unless we have a reasonable suspicion that we are going
to parse commits more often than other objects. (It adds an O(log(#
commits))
binary search to each object.)
A final thought: consider making this "try the commit graph first, but fall
back to parse_object()" a library function with a name like
struct object *parse_probably_commit(struct repository *r, struct
object_id *oid)
so other paths that are parsing a lot of commits (but also maybe tags) could
use the logic.
Thanks!
-Stolee