On Sat, Apr 17, 2021 at 12:22 AM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Jerry Zhang <jerry@xxxxxxxxxx> writes:
>
> > On Fri, Apr 16, 2021 at 5:45 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
> >>
> >> Jerry Zhang <jerry@xxxxxxxxxx> writes:
> >>
> >> > Add the --exclude-path-first-parent flag,
> >> > which works similarly to --first-parent,
> >> > but affects only the graph traversal for
> >> > the set of commits being excluded.
> >> >
> >> >    -A-------E-HEAD
> >> >      \     /
> >> >       B-C-D
> >> >
> >> > In this example, the goal is to return the
> >> > set {B, C, D} which represents a working
> >> > branch that has been merged into main branch
> >> > E. `git rev-list D ^E` will end up returning
> >> > no commits since the exclude path eliminates
> >> > D and its ancestors.
> >> > `git rev-list --exclude-path-first-parent D ^E`
> >> > however will return {B, C, D} as desired.
> >>
> >> It is not clear why you want to have this, instead of doing a more
> >> obvious "D..E^".  Even better is "E^..E", which is often what you
> >> want when viewing a history like my 'seen' that is a straight-line
> >> into which tips of branches are merged.
> >
> > My motivation is to find the point at which a release branch forked off from
> > a main branch, even though the release branch could have been merged
> > into the main branch multiple times since it was forked off.
> >
> > If we add another merge from release to main, it will be more clear
> > that those give different results:
> >
> >    -A-----E-F-main
> >      \   / /
> >       B-C-D-release
> >
> > `git rev-list --exclude-path-first-parent release ^main` returns {B, C, D}.
> > I've added commit F to show that we don't necessarily have info on E,
> > there could be many commits between it and the tip of main.
>
> OK, you meant to deal with repeated merges into integration branch.
>
> So the idea is to just name the end point merge, say F (you also
> could name D as the starting point, but see below), and
>
>  - initially mark its first parent as UNINTERESTING (i.e. E), and
>    other parents as INTERESTING (i.e. D).
>
>  - run the revision traversal machinery, but when propagating the
>    UNINTERESTING bit, give it only to the first parent.  The second
>    and later parents won't become UNINTERESTING.
>
>  - stop after we exhaust INTERESTING commits.
>
> It would probably work for your idealized topology, but I do not
> know what happens when there are criss-cross merges.  In the revised
> picture, you are merging down from the B-C-D chain into the
> mainline, but once the B-C-D chain becomes longer and diverges too
> much from the mainline, it becomes tempting to break the "merge only
> in one direction" discipline and merge back from the mainline, to
> "catch up", and such a merge will have the history of B-C-D line of
> development as its first parent.  Would that screw up the selection
> of which line of development is uninteresting?

Yeah, this flag (as well as the --first-parent flag) is mainly useful
because "git merge" will always put the "branch you're on" as parent 1 and
the "branch being merged in" as parent 2. It is possible to break this
assumption with either commit-tree or by merging while on one branch and
pushing to another, but then the user should understand the consequences
of doing so. In our case this isn't possible because a server handles all
merges into the main branches.

>
> >> > Add the --exclude-path-first-parent flag,
> >> > which works similarly to --first-parent,
> >> > but affects only the graph traversal for
> >> > the set of commits being excluded.
> >> >
> >> >    -A-------E-HEAD
> >> >      \     /
> >> >       B-C-D
>
> In any case, it was totally unclear from the proposed log message,
> and the overlong option name that does not say much did not help me
> guess what you wanted to do with it.
> Specifically, it is not clear
> what "exclude" means (we do not usually use the word in the context

"Exclude" appears in the first paragraph of the man page for git rev-list:

  "List commits that are reachable by following the parent links from the
   given commit(s), but exclude commits that are reachable from the one(s)
   given with a ^ in front of them. The output is given in reverse
   chronological order by default."

It appears 5+ more times in the man page with the same meaning.

> of revision traversal), and when we talk about "path" in the context
> of revision traversal, we almost always mean the paths to the files,
> i.e. pathspec that limits and simplifies the shape of the history.

"path" is used in the same man page for the flag "--ancestry-path". I agree
that it could be ambiguous though, so perhaps "chain" would be better.

> Also, it claims that it works similarly to --first-parent, but what
> you are doing is to propagate UNINTERESTING bit on the first-parent
> chain, which ends up showing the side branch (i.e. B-C-D chain),
> without showing the commits on the first-parent chain (A and E).
>
> What are the words that convey the idea behind this operation
> clearly at the conceptual level?  Let's think aloud to see if we can
> come up with a better name.
>
>  * first parents are uninteresting
>
>  * show commits on side branch(es)
>
>  * follow side branch.
>
> I think that is closer to the problem you are solving, if I
> understand what you wrote above correctly.
>
> Perhaps --show-side-branch or --follow-side-branch?  I dunno.

For my particular use-case I am using it in combination with --first-parent
and a single include and exclude commit to show the commits on the
"side-branch" of the include commit. But if you specify multiple commits
for either or don't use --first-parent, the behavior is different and I
don't think "--side-branch" describes it well in those cases.
Since I don't believe I can predict all use-cases for the flag, I'd rather
name it by what it "does" rather than what it is "for".

If we're concerned about length, maybe "first-parent-not" could get the
meaning across:

- for "rev-list --first-parent A --not B", only first parents are visited
  along A's ancestry
- for "rev-list --first-parent-not A --not B", it might be reasonable that,
  since B is a "not" commit, only first parents are visited along B's
  ancestry.

Overall I don't think we can make a name so clear that the user can avoid
the man page anyway.
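
To make the intended semantics concrete, here is a rough toy model in
Python. It is only a sketch and not how git's revision machinery is
implemented: it treats the output as "everything reachable from the
included tips" minus "everything reachable from the excluded tips", and
the proposed flag is modelled as walking only first parents on the
excluded side. The names PARENTS, ancestors and rev_list are made up for
this illustration, and the assumption that E merged C (and F merged D) is
read off the second diagram rather than taken from a real repository.

```python
# Toy commit graph for the second diagram in this thread.
# Each entry lists a commit's parents, first parent first.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "E": ["A", "C"],  # assumed: first merge of release into main
    "F": ["E", "D"],  # assumed: second merge of release into main
}


def ancestors(tips, parents, first_parent_only=False):
    """Return the tips plus every commit reachable from them."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        ps = parents[commit]
        # With first_parent_only, ignore the second and later parents.
        stack.extend(ps[:1] if first_parent_only else ps)
    return seen


def rev_list(include, exclude, parents, exclude_first_parent=False):
    """Commits reachable from `include` that are not hidden by `exclude`.

    exclude_first_parent models the flag proposed in this thread: the
    excluded side is walked along first parents only.
    """
    hidden = ancestors(exclude, parents, first_parent_only=exclude_first_parent)
    return ancestors(include, parents) - hidden
```

With release at D and main at F, rev_list(["D"], ["F"], PARENTS) is empty
because F reaches every commit, while rev_list(["D"], ["F"], PARENTS,
exclude_first_parent=True) hides only F, E and A and so yields B, C and D.
The model also shows why the result depends on merges always recording
the mainline as the first parent.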