On Mon, Jun 17, 2013 at 6:18 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Antoine Pelisse <apelisse@xxxxxxxxx> writes:
>
>> So here is a more thorough description of the option:
>
>> - real changes are interesting
>
> OK, I think I can understand it.
>
>> - blank lines that are close enough (less than context size) to
>>   interesting changes are considered interesting (recursive definition)
>
> OK.
>
>> - "context" lines are used around each hunk of interesting changes
>
> OK.
>
>> - If two hunks are separated by less than "inter-hunk-context", they
>>   will be merged into one.
>
> Makes sense.
>
>> The current implementation does the "interesting changes selection" in a
>> single pass.
>
> "current" meaning "the code after this patch is applied"?  Is there
> a possible future enhancement hinted here?

No. There might be, but I'm not sure it should be discussed right now
(in case you're curious, I'm thinking about interaction with combined
diff). I will take the hint and rephrase.

>> +xdchange_t *xdl_get_hunk(xdchange_t **xscr, xdemitconf_t const *xecfg)
>> +{
>> +	xdchange_t *xch, *xchp, *lxch;
>>  	long max_common = 2 * xecfg->ctxlen + xecfg->interhunkctxlen;
>> +	long max_ignorable = xecfg->ctxlen;
>> +	unsigned long changes = ULONG_MAX;

Let me explain what "changes" means, as I know it will help the rest of
the message:

It counts the number of *added* blank lines we have ignored since "lxch"
(needed to calculate the distance between lxch and xch).

It also has the meaning of what was called "interesting" before.
If changes == ULONG_MAX, we are still in interesting zone, otherwise it
means we have ignored "changes" *added* blank lines (0 being a valid
value).

(Actually, after rereading this part, it looks like I could check that
lxch == xchp rather than setting changes to ULONG_MAX.)

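To make the two roles of "changes" more concrete, here is a small
standalone sketch. It is not part of the patch: "struct change" and the
three helper names are made up for illustration, and only the
i1/chg1/chg2/ignore fields and the ULONG_MAX sentinel are taken from the
patch itself.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for xdchange_t, reduced to the fields used here. */
struct change {
	long i1;    /* first line of the change in the old file */
	long chg1;  /* number of lines removed */
	long chg2;  /* number of lines added */
	int ignore; /* non-zero if the change only touches blank lines */
};

/* ULONG_MAX is the sentinel meaning "still in the interesting zone". */
static int in_interesting_zone(unsigned long changes)
{
	return changes == ULONG_MAX;
}

/*
 * Record one more ignored change: leave the interesting zone if we were
 * still in it (the counter starts at 0), then add the blank lines this
 * change *added*. Removed blank lines (chg1) are deliberately not counted.
 */
static unsigned long ignore_change(unsigned long changes,
				   const struct change *xch)
{
	assert(xch->ignore);
	if (in_interesting_zone(changes))
		changes = 0;
	return changes + (unsigned long)xch->chg2;
}

/*
 * Distance from the end of the last interesting change (lxch) to the
 * start of xch, plus the added blank lines ignored in between.
 * Only meaningful once we have left the interesting zone.
 */
static long padded_distance(const struct change *lxch,
			    const struct change *xch,
			    unsigned long changes)
{
	assert(!in_interesting_zone(changes));
	return xch->i1 + (long)changes - (lxch->i1 + lxch->chg1);
}
```

With lxch covering old lines 10-11, one ignored change that adds three
blank lines, and xch starting at old line 20, the counter goes from
ULONG_MAX to 3 and the value compared against max_common is
20 + 3 - 12 = 11.
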
>> +
>> +	/* remove ignorable changes that are too far before other changes */
>> +	for (xchp = *xscr; xchp && xchp->ignore; xchp = xchp->next) {
>> +		xch = xchp->next;
>> +
>> +		if (xch == NULL ||
>> +		    xch->i1 - (xchp->i1 + xchp->chg1) >= max_ignorable)
>> +			*xscr = xch;
>> +	}
>
> This strips leading ignorable ones away until we see an unignorable
> one.  Looks sane.
>
>> +	if (*xscr == NULL)
>> +		return NULL;
>> +
>> +	lxch = *xscr;
>
> "lxch" remembers the last one that is "interesting".
>
>> +	for (xchp = *xscr, xch = xchp->next; xch; xchp = xch, xch = xch->next) {
>> +		long distance = xch->i1 - (xchp->i1 + xchp->chg1);
>> +		if (distance > max_common)
>>  			break;
>
> If we see large-enough gap, the one we processed last (in xchp) is
> the end of the current hunk.  Looks sane.
>
>> +		if (distance < max_ignorable &&
>> +		    (!xch->ignore || changes == ULONG_MAX)) {
>> +			lxch = xch;
>> +			changes = ULONG_MAX;
>
> The current one is made into the "last interesting one we have seen"
> and the hunk continues, if either (1) the current one is interesting
> by itself, or (2) the last one we saw does not match some
> unexplainable criteria to cause changes set to not ULONG_MAX.
>
> Puzzling.

- If we are still in the interesting zone, we take it, even if it's an
  ignorable change, because it's close enough.
- Otherwise, we only take real changes. We are close to another change,
  and we are still in the loop, so it must be interesting.

>> +		} else if (changes != ULONG_MAX &&
>> +			   xch->i1 + changes - (lxch->i1 + lxch->chg1) > max_common) {
>> +			break;
>
> If the last one we saw does not match some unexplainable criteria to
> cause changes set to not ULONG_MAX, and the distance between this
> one and the last "intersting" one is further than the context, this
> one will not be a part of the current hunk.
>
> Puzzling.

If we are no longer in the "interesting zone" (changes != ULONG_MAX), it
means we will stop if the distance is too big.
"changes" is used in the calculation to account for the changes we have
already ignored: xch->i1 - (lxch->i1 + lxch->chg1) only works if xch and
lxch are consecutive, so we need to add the blank lines we ignored.

> Could you add comment to the "changes" variable and explain what the
> variable means?
>
>> +		} else if (!xch->ignore) {
>> +			lxch = xch;
>> +			changes = ULONG_MAX;
>
> When this change by itself is interesting, it becomes the "last
> interesting one" and the hunk continues.

Exactly, and changes goes back to "interesting".

>> +		} else {
>> +			if (changes == ULONG_MAX)
>> +				changes = 0;
>> +			changes += xch->chg2;
>
> Puzzled beyond guessing.  Also it is curious why here and only here
> we look at chg2 side of the things, not i1/chg1 in this whole thing.

chg2 is the number of blank line *additions*. I don't want to coalesce
two hunks because some blank lines have been removed between the two, so
we must not change the distance calculation because of a blank line
removal. That behavior can be seen in the "ignore-blank-lines: between
changes" test.

Hope that makes things clearer,
Thanks again for the thorough reading,
Antoine