Re: [PATCH v2 2/3] builtin: introduce diff-pairs command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25/02/19 03:47PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@xxxxxxxxx> writes:
> 
> >> Regarding pathspec support, being that git-diff-pairs(1) operates solely
> >> on the provided set of file pairs produced via some other Git operation,
> >> I don't think further limiting would provide much additional value
> >> either. If we do want this though, I think support could be added in the
> >> future.
> >
> > Another consideration is which side of the pipeline should take the
> > responsibility to invoke the diffcore machinery.  We certainly could
> > make it the job for the upstream/frontend, in which case diff-pairs
> > does not have to call into diffcore-rename, BUT it also means the
> > downstream/backend needs to be able to parse two paths (renamed from
> > and renamed to).  Or we could make it the job for the downstream,
> > and forbid the upstream/frontend from feeding renamed pairs (i.e.
> > any input with status letter R or C are invalid), in which case
> > diff-pairs can choose to invoke rename detection or not by paying
> > attention to the -M option and invoking diffcore_rename() itself
> > (which should be at no-cost from coding point of view, as it should
> > be just the matter of calling diffcore_std()).

In the current implementation, diff-pairs is capable of handling input
containing rename/copy filepairs computed upstream. It does so by
parsing the input line and manually setting the status, score, and paths
for the queued `diff_filepair`.

I think diff-pairs should support rename and copy input as it would
allow for rename/copy detection to be performed upfront in a single
pass by the upstream and the resulting output could be split up and fed
to separate downstream diff-pairs. This is particularly useful for
server-side diffs to break up what would be large diffs.

> Sorry, but I hit <SEND> too early before finishing the most
> important part.  We can move the features between upstream frontends
> and downstream diff-pairs.  Depending on our goals, the best
> division of labor would be different.  If we want to make it easy
> for people to write their custom frontends, for example, it might
> make sense to allow them to be as stupid and simple as possible and
> make all the heavy lifting the responsibility of the diff-pairs
> backend, which is the shared resource these frontends share and rely
> on (so that they have incentive to help us make sure diff-pairs will
> stay correct and performant).  If on the other hand we want to allow
> people to do fancy processing in their custom frontends, maybe keeping
> diff-pairs as stupid and transparent would be a better option to give
> the people who write upstream/frontends more predictable behaviour.
> 
> Where to do the pathspec limiting is one of these things.  You could
> make it responsibility for the frontends if we assume that frontends
> must do their own limiting.  Or you could make it an optional feature
> of the backends, so that frontends that does not do its own limiting
> can ask diff-pairs to limit.  Which side to burden more really depends
> on whose job we are trying to make it easier.

For the server-side diff usecase, I think that aligns more towards
having a front-end that does more of the heavy lifting such rename/copy
detection and pathspec limiting, while the diff-pairs really just needs
to compute the individual diffs for the already specified file pairs.

I do see value though in keeping the door open for diff-pairs to become
more robust and flexible. Maybe it would be fine for now to say pathspec
limiting is not supported, but it could be in the future?

> >> The tree objects in the input are not expanded. With `git diff-pairs
> >> --raw` these objects are just printed again. With the `--patch` option,
> >> they are just ommitted.
> 
> >Instead of getting expanded into its subpaths?

The current implementation of diff-pairs is rather simple. It relies on
the upstream to feed it the file pairs with all the info upfront so it
can setup the diff queue. This means input with tree objects is also
queued as-is without being expanded further. I could maybe see a future
though where we want diff-pairs to be a more robust backend and supports
expanding these paths via -r option. Following previous discussion,
maybe it's fine to keep the initial implementation of diff-pairs on the
simple side for now. We could make diff-pairs die() for now if the -r
option is explicitly set.

Thanks
-Justin




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux