Re: [PATCH v7 0/5] git log -L, all new and shiny

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

> Thomas Rast <trast@xxxxxxxxxxxxxxx> writes:
>
>> This would be the first backwards coupling between the revision-walk and
>> the diff generation parts, at least that I know of.
>
> I am not convinced if you need to have any unusual back-coupling to
> begin with, by the way.
>
> If you say "git log -p [--options] -- pathspec", the revision
> machinery does filter commits that do not touch any paths that patch
> pathspec with the TREESAME logic, but that does not necessarily mean
> you will see _all_ the commits that are not TREESAME.

Let's clarify this part a bit.

Imagine you have three commits on a single strand of pearls.

	A---B---C

Between A and B, you only corrected indentation errors in file F as
a clean-up step, and then between B and C, you did a real change.
"git diff A B -- F" gives changes, "git diff -w A B -- F" does not.
Both "git diff B C -- F" and "git diff -w B C -- F" give you some
output.

Now, "git log --no-root -p -w C -- F" will give you C but not B.

Let's see how that happens.

The revision machinery looks at C and finds its parent B.  It runs
object level tree comparison and finds that their trees are
different at path F.  It makes a mental note that it may need to
show the log message of C, and asks the diff machinery to run
diff-tree between B and C.  The diff machinery finds that it needs
to show something even in the presense of -w option by actual
comparison, and just before showing the very first line of patch
output, it shows the log message of C (due to the earlier "mental
note").

Then the revision machinery looks at B.  It does the same between B
and A, but this time around, the diff machinery finds that, even
though A and B were _not_ TREESAME at the revision traversal level,
there is nothing to be shown after filtering with the -w option.
Hence no patch is shown and log message for B is not shown, either.

The interesting part of the above is that we did not bother using
the -w comparison to affect TREESAME check at the revision traversal
level.

I consider your "-L bottom,top" essentially the same "two blobs may
be different at the object name level, but after filtering the diff,
there may be no output" filter just like the "-w" option is.
Instead of filtering hunks that have changes only in whitespaces, it
filters hunks that do not overlap the given line range.  So if you
replace "-w" in the above example with "-L bottom,top", exactly the
same thing should happen.

The current code structure does not consider the content level
filtering done by the "-w" when computing TREESAME.  This DOES
affect how the merges are simplified.  If you and I start from a
commit X with full of indentation mistakes, you fix the earlier half
of these mistakes and make commit T while I fix the later half of
these mistakes and make commit J, and we merge our results into
merge M:

      J---M
     /   /
    X---T

none of "diff -w J M", "diff -w T M", "diff -w X J", "diff -w X T"
will output anything.  "git diff -p -w M" will however traverse both
branches, even though in this example nothing will be shown in the
output, because no two trees in these four commits are TREESAME.  

In the longer term, this _may_ be something we want to fix, but
teaching the revision machinery about only "-L bottom,top" is not
the way to do it.  Instead, we should devise a new mechanism to call
into the diff_patch machinery from the place where the revision
machinery computes TREESAME, and let it ask the filtered content
level differences if any of these options (these flags flip the
DIFF_FROM_CONTENTS diff option) are in use.  As a part of the
implementation of that new mechanism, we may want to cache the patch
such an early comparison produces, and reuse it when we actually
produce output, as an optimization.  But I think that is not limited
to the "-L bottom,top" option and an orthogonal issue, as it will
work equally well for existing "-w" filter (there may be others).

The above is where my recommendations are coming from.  The first
step for "-L bottom,top" should be to hook it as a new kind of
DIFF_FROM_CONTENTS option that compares two non-identical blobs but
possibly yield empty result into the diff machinery at the same
level as "-w".  Once that is solidly done, we can think about
updating the merge simplification logic to take DIFF_FROM_CONTENTS
changes into account when it computes TREESAME.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]