Re: Using clean/smudge filters with difftool

On Thu, Jun 18, 2015 at 01:00:36PM -0700, Junio C Hamano wrote:
> John Keeping <john@xxxxxxxxxxxxx> writes:
> 
> > I think this is a difference between git-diff's internal and external
> > diff modes which is working correctly, although possibly not desirably
> > in this case.  The internal diff always uses clean files (so it runs the
> > working tree file through the "clean" filter before applying the diff
> > algorithm) but the external diff uses the working tree file so it
> > applies the "smudge" filter to any blobs that it needs to checkout.
> >
> > Commit 4e218f5 (Smudge the files fed to external diff and textconv,
> > 2009-03-21) was the source of this behaviour.
> 
> The fundamental design to use smudged version when interacting with
> external programs actually predates that particular commit, I think.
> 
> The caller of the function that was updated by that commit, i.e.
> prepare_temp_file(), reuses what is checked out to the working tree
> when we can (i.e. it hasn't been modified from what we think is
> checked out) and when it is beneficial to do so (i.e. on a system
> with FAST_WORKING_DIRECTORY defined), which means the temporary file
> given by the prepare_temp_file() that is used by the external tools
> (both --ext-diff program and textconv filter) are designed to be fed
> and work on the smudged version of the file.  4e218f5 did not change
> that fundamental design; it just made things more consistent between
> the case where we do create a new temporary file out of blob and we
> allow an unmodified checked out file to be reused.

When I started looking at this, I assumed the problem would be that
git-difftool wasn't smudging the non-working-tree files.  But actually
everything is working "correctly"; I'm just not sure it's always what
the user wants (at least it isn't what was wanted in this case).

Currently, the behaviour is:

	internal diff: compare clean files
	external diff: compare smudged files

This makes sense for LF/CRLF conversion, where platform-specific tools
clearly want the platform's line ending but the internal diff machinery
doesn't care.
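
As a rough illustration of that split (a sketch only, not Git's actual
code; the function names here are made up for the example, and the
filter pair simply models LF/CRLF conversion on a CRLF platform):

```python
def clean(data: bytes) -> bytes:
    """Working tree form -> repository form: normalise CRLF to LF."""
    return data.replace(b"\r\n", b"\n")


def smudge(data: bytes) -> bytes:
    """Repository form -> working tree form: LF becomes CRLF."""
    # Normalise first so an already-smudged input is not doubled up.
    return clean(data).replace(b"\n", b"\r\n")


def internal_diff_inputs(blob: bytes, worktree: bytes):
    """Internal diff: both sides are compared in clean form."""
    return blob, clean(worktree)


def external_diff_inputs(blob: bytes, worktree: bytes):
    """External diff: both sides are handed over in smudged form."""
    return smudge(blob), worktree
```

With this filter pair both modes compare like with like, so an
unmodified file shows no difference either way; only the line endings
seen by the tool change.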

However, from the filter description in an earlier email, I think
Florian is using a clean filter to remove output from IPython notebook
files (it seems that IPython saves both the input and the output in the
same file [1] and the output is the equivalent of, for example, C object
files).  In this case, the filter is one-way and discards information
from the working tree file, producing a smaller and more readable diff
in the process.
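
I haven't seen Florian's actual filter, but a one-way clean step of this
kind might look roughly like the sketch below.  It assumes the nbformat 4
notebook layout (a top-level "cells" list whose code cells carry
"outputs" and "execution_count"); the function name strip_outputs is
hypothetical.

```python
import json


def strip_outputs(notebook_text: str) -> str:
    """Return the notebook JSON with code-cell outputs removed.

    Assumes the nbformat 4 layout; other layouts pass through with
    only the JSON formatting normalised.
    """
    notebook = json.loads(notebook_text)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # discard generated output
            cell["execution_count"] = None  # discard the run counter
    # Stable formatting keeps the stored blob from churning between runs.
    return json.dumps(notebook, indent=1, sort_keys=True) + "\n"
```

A real clean filter would read the file on stdin and write the result to
stdout.  The point is that the discarded output cannot be recovered, so
no smudge filter can undo it.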

I think the summary is that there are some scenarios where the external
diff tool should see the smudged version and others where the clean
version is more appropriate, and Git should support both options.  This
seems to be a property of the filter, so I wonder if the best solution
is a new "filter.<name>.extdiff = [clean|smudge]" configuration
variable (there's probably a better name for the variable than
"extdiff").
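
To make the proposal concrete, here is a sketch of the selection such a
variable would imply.  Nothing like this exists in Git today; the
function name and parameters are hypothetical, and "smudge" as the
default is only meant to mirror the current behaviour.

```python
from typing import Callable

Filter = Callable[[bytes], bytes]


def external_diff_input(content: bytes, from_worktree: bool,
                        clean: Filter, smudge: Filter,
                        extdiff: str = "smudge") -> bytes:
    """Pick the form of one side of the diff for an external tool.

    content is the working tree file when from_worktree is true,
    otherwise the blob as stored in the repository.
    """
    if extdiff == "smudge":
        # Current behaviour: blobs are smudged, working tree files used as-is.
        return content if from_worktree else smudge(content)
    if extdiff == "clean":
        # Proposed alternative: working tree files are cleaned, blobs used as-is.
        return clean(content) if from_worktree else content
    raise ValueError("extdiff must be 'clean' or 'smudge'")
```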


[1] http://pascalbugnion.net/blog/ipython-notebooks-and-git.html