On Tue, 7 Nov 2006, Liu Yubao wrote: > > Fake commit is only for digging branch scope history, I can *outline* what has > been merged to a branch and don't care about how these good work are done on > earth. The thing is, I think you see a good thing ("outlining"), and miss all the downsides ("extra noise", "incorrect outlining"). Yes, I can see it being useful for reading logs in a perfect world. However, in real life, more than half of my fast-forwards are just me tracking another branch. An "outline" would be _wrong_. I _want_ to fast-forward, because I'm moving the trees from one machine to another, and the reason it's a fast-forward is exactly the fact that absolutely zero work had been done on the machine I'm pulling from - I'm pulling just to keep up-to-date. So now, just to keep things sane, your scheme would require that people AHEAD OF TIME tell the system whether they want to fast-forward or whether they want to create a magic merge commit as a "outlining" marker. See? Fast-forwarding is absolutely the right thing to do in 99% of all cases. For me, it's perhaps only half, because I do several true merges every day, but that's really quite unusual - I'm the top-level maintainer. Nobody else should EVER do it. And the thing is, I refuse to work with a system that makes one person special. I _know_ I'm special, I'm the smartest, most beautiful, and just simply the best person on the planet. I don't need a tool that tells me so. So deep down, what you're really suggesting that there be a special mode that is ONLY ever used for the top-level maintainer, so that he can create an "outline" in the history. Put that way, it almost makes sense, until you realize that 99.9% of all people aren't top-level maintainers, and you don't want them creating crap like that. And that "outlining" is likely most easily done with ( git log lastversion.. | git shortlog ; git diff --stat --summary lastversion.. ) | less -S instead. But more importantly, I don't personally like the "top-level maintainer" model. Yes, it's how people do end up working a lot, but quite frankly, I'd rather not have the tool support it, especially if there is ever a schism in a development process. I want to support _forking_, which very much implies having somebody pulling the "wrong way". Time for some purely philosophical arguments on why it's wrong to have "special people" encoded in the tools: I think that "forking" is what keeps people honest. The _biggest_ downside with CVS is actually that a central repository gets so much _political_ clout, that it's effectively impossible to fork the project: the maintainers of a central repo have huge powers over everybody else, and it's practically impossible for anybody else to say "you're wrong, and I'll show how wrong you are by competing fairly and being better". For example, gcc (and other tools) have gone through this phase. You've had splinter groups (eg pgcc) that did a hell of a lot better work than the main group, and the tools really made it really hard for them to make progress. I think the most important part of a distributed SCM is not even to support the "main trunk", but to support the notion that anybody can just take the thing and compete fairly. With the kernel as an example, any group could literally just start their own kernel git tree, and git should make it as easy as humanly possible for them to track my tree WHILE _THEY_ STILL REMAIN IN CHARGE of their own tree. That doesn't mean that forking is easy - over the years people have simply grown so _used_ to me that they mostly trust me and they are comfy working with me, because even if I've got my quirks (or "major personality disorders" as some people might say), people mostly know how to work with them. But the point is, there should be no _tool_ issues. As far as git is concerned, every single developer can feel like he is the top-level maintainer - it doesn't have to be a hierarchy, it really can be a "network of equal developers". I want the _tool_ to have that world-view, even if most projects in the end tend to organize more hierarcically than that. Because the "everybody is equal" worldview actually matters in the only case that _really_ matters: when problems happen. For example: I use git to maintain a few other projects I've started too. I use git to maintain git itself, but I'm no longer the maintainer, simply because I think it's a lot better to step down than stand in the way of somebody better, and because I think it's hard to be the "lead person" on multiple projects. The same thing is happening to "sparse", which was dormant for a while (it worked, and I fixed problems as people reported them, but it did everything I had set out to do, so my motivation to develop it further had just gone down a lot). What happened? Somebody else came along, showed interest, started sending me patches, and I just suggested he start his own tree and start maintaining it. Now, both of those transitions were very peaceful, but it should work that way even if the maintainer were to fight tooth and nail to hold on to his "top dog" status. And that's where it's important that the tool not separate out "top maintainers" from "other people". > I want to separate a branch, not to separate commits by some author, for > example, many authors can contribute to git's master branch, I want to > know what happened in the master branch like this: > good work from A; > good work from C; > merge from next; -----> I don't care how this feature is realized. > good work from A; Really, "git log | git shortlog" will come quite close. I use it all the time for the kernel, and it's powerful. Try it with the kernel archive, just for fun. Do git log v2.6.19-rc4.. | git shortlog | less -S with the current kernel, and see how easy it is to get a kind of feel for what is going on. We do it by two means: - sorting by author. This sounds silly, but it's actually very powerful. It's not so much that it credits people better (it does) or that it makes the logs shorter by mentioning the person just once (it does that too), it's really nice because people tend to automatically do certain things. One person does "random cleanups". Another one works on "networking". A third one maintains one particular architecture, and so on.. - encourage people to have a "topic: explanation" kind of top line of the commit (and encourage people to have that "summary line" in the first place: not every SCM does that, and everybody else is strictly much worse than git) In fact, when I do this, I usually _remove_ the merges, because they end up being just noise. Really: go and look at the current kernel repo, and do the above one-liner, and realize that I have a hunking big set of commits credited to me right now (it says 30 commits), and in fact I think I'm the #1 author right now on that list. But when I send out the description, I actually use the "--no-merges" flag to "git log", because those merge messages are _useless_. They really don't do anything at all for me, or for anybody else. Re-run the above one-liner that way, and suddenly I drop to just 5 commits (and quite often, I'm much less - sometimes the _only_ commit I have for an -rc release is the commit that changes the version number). But it's actually more readable. So I can kind of see what you want, but I'm 100% convinced that the information you _really_ want is better done totally differently. So if you want to get the "big picture" thing, git does actually support you in several ways. That "git shortlog" is very useful, but so is the "drill down by subsystem". For example, you could do git log --no-merges v2.6.19-rc4.. arch/ | git shortlog | less -S and you'd get the "summary view" of what happened in architecture- specific code. It's not the same thing as the "merge log", but it's actually very useful. (You can do the same with git. Something like git log --no-merges v1.4.3.4.. | git shortlog | less -S shows quite clearly that a lot of new stuff is gitweb-related, for example. Could we do better "reporting" tools? I'm absolutely sure we could. It might be interesting to be able to ignore not just commits, but "trivial patches" too. For example, if you're looking for what changed on a high level, you're not likely to care about patches that change just a few lines. You might want to see only the commits that change an appreciable fraction of code, and so it might be very interesting to have a "git shortlog" that would take patch size into account, for example. So I'm not saying that git is perfect. I'm just saying that there are better ways (with much fewer downsides) to get what you want, than the way you _think_ you want. Linus - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html