Am 29.01.2011 01:02, schrieb Junio C Hamano: > Renà Scharfe <rene.scharfe@xxxxxxxxxxxxxx> writes: > >> Subject: pickaxe: don't simplify history too much >> >> If pickaxe is used, turn off history simplification and make sure to keep >> merges with at least one interesting parent. >> >> If path specs are used, merges that have at least one parent whose files >> match those in the specified subset are edited out. This is good in >> general, but leads to unexpectedly few results if used together with >> pickaxe. Merges that also have an interesting parent (in terms of -S or >> -G) are dropped, too. >> >> This change makes sure pickaxe takes precedence over history >> simplification. > > Hmmm, I understand the _motivation_ behind the change in the second hunk, > in that you _might_ want to dig the side branch that did not contribute > anything to the end result when looking for a needle with either -S or -G, > but doesn't the same logic apply to things like --grep? Yes, that's true. I have to admit that I'm mostly reacting to the unintuitive output given in the specific case ("test driven") and probably don't fully understand the underlying problem and all its implications. > I do not think it is a good idea to unconditionally disable simplification > for -S/G without a way for the user to countermand (even though I could be > persuaded to say that the flipping the default for -S/-G/--grep might have > been a better alternative in hindsight). Currently there is no way to turn simplification off, resulting in certain commits to become invisible when using e.g. -S in combination with path specs. > The user can control this behaviour by giving or not giving --simplify > from the command line anyway, no? Yes, but that goes only so far (see the examples in the parent post which use --full-history; --simplify-merges gives 3 more results with -m, but still not the full 160). And as a user I don't want to have to add another option in order to use pickaxe with path specs. My expectation is that my search has precedence over any history simplification, which is a nice and necessary optimization, but shouldn't hide any search results that I would have got if I had used no path specs. However, it definitely looks like a corner case and I still don't know what happened with all these merges. > As to the first hunk, I have no idea why this is a good change. I didn't see any other way to fix the example given in the commit message.. > It feels as if you are fighting against what this part of the code does in > try_to_simplify_commit(): > > switch (rev_compare_tree(revs, p, commit)) { > case REV_TREE_SAME: > tree_same = 1; > if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) { > /* ... */ > pp = &parent->next; > continue; > } > parent->next = NULL; > commit->parents = parent; > commit->object.flags |= TREESAME; > return; > ... > > When we see a _single_ parent that has the same tree, we set tree_same and > also cull the parent list to contain only that parent. When we are not > simplifying the history, we do not cull the parent list and will inspect > other parents as well, but still we set tree_same to 1 here. When some > other parent is found to be different, we set tree_changed to 1. So we > have four states (same = (0, 1) x changed = (0, 1)). > > The code before your addition in the first hunk says that we keep the > commit if there is no parent with the same contents (i.e. !tree_same) and > there is at least one parent with different contents (i.e. tree_changed). > I suspect that this logic may not be working well when we do not simplify > the merge. > > Let's look at the original code before your patch again. > > 1. If all the parents of a commit are the same, we will see (tree_same && > !tree_changed), so we get TREESAME. > > 2. If some but not all of the parents are the same, we will see (tree_same > && tree_changed), and we end up getting TREESAME. > > 3. If none of the parents is the same, (!tree_same && tree_changed) holds > true, and we do not get TREESAME. For completeness, a fourth case (!tree_same && !tree_changed), which would be triggered by commits whose parents are all classified as REV_TREE_NEW. That's another corner case for sure, but the old code would mark it TREESAME and your patch changes that. > Perhaps the second condition needs to be tweaked for the "do not simplify > merges" case? That is, we split 2. into two cases: > > 2a. When simplifying the merges, if any of the parents is the same as the > commit, we say TREESAME (the same as before); > > 2b. When not simplifying, we say TREESAME only when all the parents are > the same as the commit. Otherwise the merge commit itself is worth > showing, i.e. !TREESAME. > > But I probably am missing some corner cases you saw in your analysis... > > diff --git a/revision.c b/revision.c > index 7b9eaef..0147124 100644 > --- a/revision.c > +++ b/revision.c > @@ -439,7 +439,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) > } > die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1)); > } > - if (tree_changed && !tree_same) > + if ((!revs->simplify_history && tree_changed) || !tree_same) > return; > commit->object.flags |= TREESAME; > } The patch lists the right commits in the test case, but requires the option --full-history to be given. Without it no output is given if the full file name is specified, as in master. Perhaps we should check my underlying assumption first: is it reasonable to expect a git log command to show the same commits with and without a path spec that covers all changed files? Renà -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html