Re: [PATCH] (experimental) per-topic shortlog.

Junio C Hamano <junkio@xxxxxxx> · Sun, 26 Nov 2006 22:48:32 -0800

Linus Torvalds <torvalds@xxxxxxxx> writes:

> I think that's actually likely the exception rather than the rule. It's 
> much more likely that people have almost _all_ active development done on 
> side branches, and that - together with rebasing of the side branches - 
> inevitably means that the "main branch" ends up not having such a clean 
> set of "topic branch" merges.

You are absolutely right about "Andrew patchbomb" which is
linear and does not have the series boundary.  Import from
mostly linear foreign SCM would have the same issue.  Merge
topology would not help us at all in these cases.

> In addition, on a more mature tree, a lot (probably _most_) of the commits 
> aren't really "topics" at all, but "maintenance", which exacerbates the 
> problem: you don't have a "line of development of this feature",
> ...
> Put another way: bugs get fixed one by one, not in a nice linear fashion 
> by "topic".

Again, you are right, but that only means topic based grouping
is not for everybody, and certainly is not suitable for a long
stretch of commits on the trunk of a mature project because they
tend to touch everywhere and not all that clustered.  If those
bugs were fixed by committing on separate topic branches and
then later merged, the topology based clustering would get the
grouping right, but I would imagine we would end up seeing
hundreds of such short groups which would not be useful at all.
In such cases, it would be much more useful to have one huge
group that says "these are small fixes, each of which may touch
different areas -- they are not related but grouped together
because they are all small, obviously correct and harmless
fixes".  So I suspect that is a slightly different issue -- it
just illustrates the need for an "ungrouped" bin.

> So I'm coming at it from a totally different project - where "topic 
> branches" simply aren't delineated as much, and even when they are, they 
> tend to be merged in multiple steps (and they pull both ways when they 
> aren't re-based).

I agree multiple steps merge and merging both ways would happen
in real life, but I had an impression that fpc handles that
topology reasonably well, unless that "merge from upstream" are
of "too frequent, automated and useless" kind of merges.

> ... but in the kernel, I pretty much guarantee 
> that you probably get better "topic clustering" by going simply by author, 
> like the old standard "git shortlog" does. Because that will tend to get 
> the clustering at a finer granularity (ie not just "networking", but 
> things like "packet filtering" etc).
>
> So the "sort by people" actually works fairly well, but it's kind of an 
> "incidental" thing, and it _would_ be potentially useful to have other 
> ways of grouping things.

I think "networking" vs "packet filtering" largely depends on
how the networking subsystem you pull from is managed.  If
netfilter comes as e-mailed patches to DaveM and are applied
onto the trunk of networking subsystem, we will face exactly the
same problem as we have with Andrew's patchbomb to your trunk.

If it were managed on a separate topic branch in the networking
subsystem repository (either DaveM manages them in his
repository as a topic, or DaveM pulls from netfilter git
repository -- I do not know how that part of the patchflow
works), I would imagine you would get the same "per topic"
grouping.

Another factor is that the author population of a wide and
mature project like the kernel tends to be more diverse, and a
single person tends to be focused on one thing at a time while
others work on different things.  There is enough work in one
specific area for one person to do, and the project is too wide
for one person to be everywhere.

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html