FYI, my comments don't fully cover this patch yet. Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> wrote: > diff --git a/org.spearce.jgit/src/org/spearce/jgit/revplot/PlotCommit.java b/org.spearce.jgit/src/org/spearce/jgit/revplot/PlotCommit.java > index 5a5ef1e..fac89f5 100644 > --- a/org.spearce.jgit/src/org/spearce/jgit/revplot/PlotCommit.java > +++ b/org.spearce.jgit/src/org/spearce/jgit/revplot/PlotCommit.java > @@ -58,14 +59,19 @@ > > PlotCommit[] children; > > + Ref[] refs; > + > /** > * Create a new commit. > * > * @param id > * the identity of this commit. > + * @param tags > + * the tags associated with this commit, null for no tags > */ > - protected PlotCommit(final AnyObjectId id) { > + protected PlotCommit(final AnyObjectId id, final Ref[] tags) { > super(id); > + this.refs = tags; this.refs isn't final. There is no reason to be adding it to the constructor and having this ripple effect all the way up into the RevWalk class. Maybe PlotWalk should override next() and only set PlotCommit.refs on things returned from super.next()? This way we only do tag lookup on commits that the filtering rules have said should be shown in to the application, but the refs should still be inserted prior to the application seeing the RevCommit. > @@ -54,6 +66,7 @@ > public PlotWalk(final Repository repo) { > super(repo); > super.sort(RevSort.TOPO, true); > + reverseRefMap = repo.getAllRefsByPeeledObjectId(); I wonder if this shouldn't be done as part of the StartGenerator (or something like it but specialized for PlotWalk). I only say that because a reused PlotWalk may want to make sure its ref map is current with the repository its about to walk against. > + @Override > + protected Ref[] getTags(final AnyObjectId commitId) { > + List<Ref> list = reverseRefMap.get(commitId); > + Ref[] tags; > + if (list == null) > + tags = null; > + else { > + if (list != null && list.size() > 1) { > + Collections.sort(list, new Comparator<Ref>() { > + public int compare(Ref o1, Ref o2) { > + try { > + Object obj1 = getRepository().mapObject(o1.getObjectId(), o1.getName()); > + Object obj2 = getRepository().mapObject(o2.getObjectId(), o2.getName()); > + long t1 = timeof(obj1); > + long t2 = timeof(obj2); > + if (t1 > t2) > + return -1; > + if (t1 < t2) > + return 1; > + return 0; > + } catch (IOException e) { > + // ignore > + return 0; > + } > + } > + long timeof(Object o) { > + if (o instanceof Commit) > + return ((Commit)o).getCommitter().getWhen().getTime(); > + if (o instanceof Tag) > + return ((Tag)o).getTagger().getWhen().getTime(); > + return 0; You are inside of a PlotWalk, which extends RevWalk, which has very aggressive caching of RevCommit and RevTag data instances, and very fast parsers. Much faster than the legacy Commit and Tag classes. I think we should use the RevCommit and RevTag classes to handle sorting here. RevCommit already has the committer time cached and stored in an int. RevCommit probably should learn to do the same for its "tagger" field. The tiny extra bloat (1 word) that adds to a RevTag instance is worth it when we consider implementing something like this and/or git-describe where sorting tags by their dates is useful. > + tags = list.toArray(new Ref[list.size()]); I wonder if using a Ref[] here even makes sense given that the data is stored in a List<Ref>. I use RevCommit[] inside of RevCommit generally because the number of entries in the array is 1 or 2 and the array is smaller than say an ArrayList. In hindsight those RevCommit[] probably should be a List<RevCommit> with different list implementations based on the number of parents needed: 0 parents: Collections.emptyList() 1 parent: Collections.singletonList() 2 parents: some especialized AbstractList subclass 3 parents: ArrayList I think it would actually use less memory per instance, which is a huge bonus, but we'd pay a downcasting penalty on each access. HotSpot is supposed to be really good at removing the downcast penalty from say java.util.List, but I don't if it can beat a typed array access. Sorry I got off on a bit of a tangent here. I'm just trying to point out that the primary reason I've usd an array before is probably moot here since the data is already in a List. > diff --git a/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java b/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java > index d7e4c58..41d57c6 100644 > --- a/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java > +++ b/org.spearce.jgit/src/org/spearce/jgit/revwalk/RevWalk.java > @@ -53,6 +53,7 @@ IMHO this class shouldn't need to be modified. > @@ -541,7 +542,7 @@ public RevTree lookupTree(final AnyObjectId id) { > public RevCommit lookupCommit(final AnyObjectId id) { > RevCommit c = (RevCommit) objects.get(id); > if (c == null) { > - c = createCommit(id); > + c = createCommit(id, getTags(id)); This code is performance critical to commit parsing. Trying to lookup tags for every commit we evaluate, especially ones that we will never show in the UI (because they are uninteresting) but that we still need to parse in order to derive the merge base is just a huge waste of time. Same applies for the lookupAny and parseAny methods. -- Shawn. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html