[Please send replies Cc: git mailing list]

Robert Fitzsimons wrote:

> While looking at the gitweb source yesterday, I noticed a number of
> similar expensive workflows used by a number of actions (summary,
> shortlog, log, rss, atom, and history).
>
> The current workflows are:
>   get ~100 sha1's using rev-list
>   foreach sha1
>     get/parse 1 commit using rev-list
>     output commit
>
> The new workflows I'm proposing would be:
>   get/parse ~100 commit's using rev-list
>   foreach commit
>     output commit

I have tried this approach too. Take a look at
  http://repo.or.cz/w/git/jnareb-git.git?a=log;h=Attic/gitweb/parse_rev_list
or at the discussion started with
  Message-Id: <200609061504.40725.jnareb@xxxxxxxxx>
  http://mid.gmane.org/200609061504.40725.jnareb@xxxxxxxxx

> The following simplified commands gives an idea of the git only overhead
> between these two workflows.
>
> time \
> for r in `git-rev-list --max-count=100 HEAD --` ; \
> do git-rev-list --header --parents --max-count=1 $r -- ; \
> done > /dev/null
>
> real 0m0.490s
> user 0m0.224s
> sys 0m0.228s
>
> time \
> git-rev-list --header --parents --max-count=100 HEAD -- > /dev/null
>
> real 0m0.058s
> user 0m0.008s
> sys 0m0.004s
>
> There would seems to be a benefit from making the proposed change to
> these workflows, when run on my machine against a clone of Linus's tree.

The problem is that it works only for the "log" and "shortlog" views; it
doesn't work for the "history" view. Now both share the same infrastructure.

The problem is that when there is a path limiter (be it a file or a
directory) the history is simplified, and parents are _rewritten_ according
to the simplified history. Whether this happens depends on a strange
combination of --header, --parents and --full-history. The details should be
somewhere in the archives. And we don't want to use the parents from the
commit object, because there might be grafts, or it might be a shallow clone.

On the other hand, we don't really need parents for log, shortlog and
history...
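
For illustration only, the proposed single-pass workflow (one rev-list call,
then a loop over the returned records) could be sketched in bash roughly as
follows. The helper name summarize_commits is hypothetical and is not part of
git or gitweb. The sketch assumes the record layout that rev-list --header
--parents is understood to produce: records terminated by a NUL byte, a first
line holding the commit id and its parents, and a message indented by four
spaces after the first blank line.

```shell
#!/usr/bin/env bash
# Sketch: consume the output of a single
#   git-rev-list --header --parents --max-count=N <rev> --
# call in one pass, instead of running rev-list once per commit.
# summarize_commits is a hypothetical name used only for this example.

summarize_commits() {
    local record first_line sha subject
    # Assumption: every record ends with a NUL byte.
    # read -d '' splits on NUL (bash-specific, not POSIX sh).
    while IFS= read -r -d '' record; do
        # First line: commit id, then its (possibly rewritten) parents.
        first_line=${record%%$'\n'*}
        sha=${first_line%% *}
        # Assumption: the message starts after the first blank line
        # and each message line is indented by four spaces.
        subject=${record#*$'\n\n'}
        subject=${subject%%$'\n'*}
        subject=${subject#"    "}
        printf '%s %s\n' "$sha" "$subject"
    done
}
```

It would be fed from one rev-list invocation, for example
`git-rev-list --header --parents --max-count=100 HEAD -- | summarize_commits`,
which prints one "id subject" line per commit. It only shows the shape of the
loop; gitweb itself does this parsing in Perl.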
> One issue with this change is that, gitweb is page orientated. Page 0
> shows the first 100 items from a given hash, page 1 uses the same given
> hash but show 100 to 199 items, etc. Using 'git-rev-list --header
> --parents' and then throwing away most of the result is very wasteful.
>
> So I'm suggesting we add a new option to git-rev-list which will only
> start show results once its has iterated past a given number of items.
> Using a caret or tilde doesn't seem to return the same result.
>
> I've attached a discussion patch which adds a new option --start-count
> to git-rev-list and changed the summary and showlog actions of gitweb to
> use this new option.

Very nice idea.

> I'm sure there are many improvements to this patch, comments?

Perhaps this patch should be split in two? (Usually either the second mail
is a reply to the first, or both are replies to an introductory letter,
typically with a table of contents and a diffstat of the series.)

[...]

Documentation (of the --start-count / --skip option), please?

P.S. Thanks for the patches.
P.P.S. Do you have any comments on the latest "[RFC] gitweb wishlist and
TODO list" series?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git