Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> wrote: > måndagen den 6 oktober 2008 10.08.34 skrev Shawn O. Pearce: > > Maybe PlotWalk should override next() [...] > > I missed something here I think. Thanks. Maybe I thought... doesn't matter. > I'll rewrite the changes in and related to this completely. OK, looking forward to the changes. > > > @@ -54,6 +66,7 @@ > > > public PlotWalk(final Repository repo) { > > > super(repo); > > > super.sort(RevSort.TOPO, true); > > > + reverseRefMap = repo.getAllRefsByPeeledObjectId(); > > > > I wonder if this shouldn't be done as part of the StartGenerator > > (or something like it but specialized for PlotWalk). I only say > > that because a reused PlotWalk may want to make sure its ref map > > is current with the repository its about to walk against. > > noted. Be careful. StartGenerator is package access only and I think PlotWalk is in a different package. IMHO binding into the StartGenerator (by subclassing it and replacing it) feels like the right thing to me, but it means making a bunch of changes to RevWalk and StartGenerator to make that type public and allow PlotWalk to inject a different subclass of StartGenerator for itself. The StartGenerator trick is very useful though; it allows the RevWalk to perform init activity only on the first invocation of next() and then removes the init code entirely from the call path. It is a virtual dispatch, but I figured its a virtual dispatch anyway due to the way the other generators are chained together based the filters so we at least avoid a "if (first) {...}" test on every call to next. > > You are inside of a PlotWalk, which extends RevWalk, which has very > > aggressive caching of RevCommit and RevTag data instances, and very > > fast parsers. Much faster than the legacy Commit and Tag classes. > > Maybe we should rid us of them completely and make new commit from > these classes of possible more lightweight ones. The old classes were > straightforward but are on overtime now. I'm trying to go in that direction. However there are three reasons why we aren't ready yet: - Commits can only be made by the old-style Commit class. - Tags can only be made by the old-style Tag class. - Trees can only be made by the old-style Tree class. I have some experimental code in my merge branch to solve the last one by teaching DirCache how to write out the index and update the cache trees with the ObjectIds of the newly created trees, but I have not had a chance to clean the patches up with test cases and docs to submit to the git ML yet. So I'm actively working on fixing the last item. Oh, and of course existing callers have to be converted. But I am trying to avoid introducing new callers. > As you noted earlier we usually have at most one ref per commit to sort. > Not much to optimize for speed here with current repos. For a one-element > list the compare routine wil not even be invoked. Oh, good point. > > In hindsight those RevCommit[] probably should be a List<RevCommit> > > with different list implementations based on the number of parents > > needed: > > > > 0 parents: Collections.emptyList() > > 1 parent: Collections.singletonList() > > 2 parents: some especialized AbstractList subclass > > 3 parents: ArrayList > > I guess measuring is the right way. For these small arrays, my bet > is that List is way faster because we do not need any bounds checking. > Hotspot is reasonably good at optimizing those away, but that won't > help for much small arrays. Yup. Its something to explore. But I lack the time right now so I'm going going to be making any changes and measuring it anytime soon. ;-) Maybe someone who wants to get involved can look at this. Or any other number of issues we still have open. But without measurements to back up the code either way I won't be making changes to what we have right now. -- Shawn. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html