Re: [PATCH v7 0/5] git log -L, all new and shiny

Junio C Hamano <gitster@xxxxxxxxx> writes:

> Thomas Rast <trast@xxxxxxxxxxxxxxx> writes:
>
>> I too thought it would never happen -- but then again this is still
>> not ready, I'm just trying to give it some exposure.
>> ...
>> There's also a longer-term wishlist hinted at in the commit message of
>> the main patch: the diff machinery currently makes no provisions for
>> chaining its various bells and whistles.
>
> I am not convinced that it is "diff machinery makes no provisions"
> that is the problem. Isn't it coming from the way the series limits
> the output line range and reimplements its own output routine?

Well, in a very circular logic sense, yes: I reimplement the output
routine because that's the only way I could think of doing it right now :-)

However, notice that word-diff also reimplements its own output routine,
though it probably has a better standing since it is a different format.

>  - add a mechanism to pass the "interesting" line range and path
>    down to the callchain from xdi_diff_outf() to xdiff_outf();
>
>  - make one of these functions filter out (i.e. not call the
>    callback xdiff_emit_consume_fn) hunks that do not overlap with
>    the line range you are interested in (I would presume that they
>    would be a few new fields in xdemitconf_t structure); and
>
>  - while recording the corresponding line ranges in the other side
>    of the hunks that are output,
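
For concreteness, the filtering step in the second bullet boils down
to an interval-overlap test per hunk.  The following is only a minimal
sketch of that test: struct line_range and hunk_overlaps() are
hypothetical names, not fields of xdemitconf_t or anything in this
series, and treating a zero-length hunk as a single point is a
simplifying assumption.

```c
#include <assert.h>

/* Hypothetical type for illustration; not part of xdiff or this series. */
struct line_range {
	long start;	/* first interesting line, 1-based, inclusive */
	long end;	/* last interesting line, inclusive */
};

/*
 * Return 1 if a hunk covering `count` lines starting at line `begin`
 * overlaps the interesting range, 0 otherwise.  A hunk with count == 0
 * is treated as sitting at `begin` (simplifying assumption).
 */
static int hunk_overlaps(const struct line_range *want, long begin, long count)
{
	long last = count > 0 ? begin + count - 1 : begin;

	assert(want->start <= want->end);
	return begin <= want->end && last >= want->start;
}
```

A filter along these lines would skip the callback whenever
hunk_overlaps() returns 0.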

Hrm.

This would be the first backwards coupling between the revision-walk and
the diff generation parts, at least that I know of.  Normally the
revision walker just calls out to the (line-wise, not tree-based) diff
engine when it wants to show a commit.  Now suddenly the diff engine is
used (a lot, too) in simplifying the history.

Ideally we would want to reuse diffs that have already been generated,
as this is a very expensive process.  The current log -L implementation
manages to do this at the cost of reimplementing the diff output
routines instead.

You solve it instead by mandating that the diff engine itself updates
the "interesting" ranges, but that needs a lot of inside knowledge: like
in blame, we sometimes explore alternatives (e.g. for merges; or with
-M, though log -L in this version does not implement that feature).

So we would end up either redoing diffs or with a very tight coupling,
which IMHO just makes the mess worse.

Or am I missing something?

I instead have the vision that eventually diffs should be represented
internally as something like my pairs of struct range_set.  Then we
could run more passes on them as needed, and have a "common currency"
between all diff-related work.  Only the last one should then actually
output the diff.
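
To illustrate the "more passes" idea, here is a rough sketch of one
such pass in C.  The struct layout and the names diff_ranges and
pass_limit_to_range() are assumptions made up for this example; the
actual struct range_set in the series may well look different.  The
point is only that a pass rewrites the range pairs and emits nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the real struct range_set may differ. */
struct range {
	long start, end;	/* half-open: [start, end) */
};

struct range_set {
	size_t nr;
	struct range *ranges;	/* sorted, non-overlapping */
};

/* One diff: the i-th parent range was changed into the i-th target range. */
struct diff_ranges {
	struct range_set parent;
	struct range_set target;
};

/*
 * Example pass: keep only the changes whose target side overlaps
 * [want->start, want->end).  Works in place, returns the number of
 * surviving changes, and produces no output itself.
 */
static size_t pass_limit_to_range(struct diff_ranges *d,
				  const struct range *want)
{
	size_t i, kept = 0;

	assert(d->parent.nr == d->target.nr);
	for (i = 0; i < d->target.nr; i++) {
		const struct range *t = &d->target.ranges[i];

		if (t->start < want->end && t->end > want->start) {
			d->parent.ranges[kept] = d->parent.ranges[i];
			d->target.ranges[kept] = d->target.ranges[i];
			kept++;
		}
	}
	d->parent.nr = d->target.nr = kept;
	return kept;
}
```

Further passes with the same shape could be chained on the same
struct, with a separate final step turning the surviving pairs into
actual diff output.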

That still doesn't properly account for the case where the data format
is no longer in terms of hunks (such as for word-diff, or the stat
formats), though.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch