Re: interaction between git-diff-index and git-apply

Junio C Hamano <gitster@xxxxxxxxx> · Sun, 22 Jan 2017 14:06:45 -0800

Ariel Davis <ariel.z.davis@xxxxxxxxxx> writes:

> I have noticed an interesting interaction between git-diff-index and git-apply.
> Essentially, it seems that if we start with a clean working tree, then
> git-apply a patch, then git-apply the reverse of that patch, git-diff-index
> still thinks files are modified. But then, if we git-status, git-diff-index
> seems to "realize" the files are actually not modified.

That is perfectly normal and you are making it too complex.  You do
not need to involve "git apply" at all.

    $ git init
    $ echo hello >file && git add file && git commit -m initial
    $ git diff-index HEAD
    $ echo hello >file
    $ git diff-index HEAD
    :100644 100644 ce013625030ba8dba906f756967f9e9ca394464a 0000000000000000000000000000000000000000 M	file

    $ git update-index --refresh
    $ git diff-index HEAD
    $ exit

A few things about Git that are involved in the above are:

 * There are plumbing commands that are geared more towards
   scripting Git efficiently and there are end-user facing Porcelain
   commands.

 * Git uses cached "(l)stat" information to avoid having to inspect
   the contents of the file all the time.  The idea is that Git
   remembers certain attributes (like size and last modified time)
   of a file when the contents of the file and the blob object in
   the index are the same (e.g. when you did "git add file" in the
   above sequence), and it can tell a file was edited/modified if
   these attributes are different from those recorded in the index
   without comparing the contents of the file with the blob.

 * The plumbing commands trust the cached "(l)stat" information for
   efficiency, and whoever uses the plumbing commands is responsible
   for culling the false positives when cached "(l)stat" information
   is used to see which paths are modified.  Such callers (typically
   scripted commands) do so with "update-index --refresh".

 * The Porcelain commands sacrifice that efficiency and internally do
   an equivalent of "update-index --refresh" at the beginning to
   hide the false positives.
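
As a rough sketch of what a script built on the plumbing commands
might do to cull those false positives itself: refresh first, then
ask "diff-index".  The helper name worktree_is_clean, the throwaway
repository made with mktemp, and the demo identity are all made up
for the illustration and are not part of Git.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the working tree matches HEAD.
worktree_is_clean () {
	# Re-stat tracked files so that paths whose contents are unchanged
	# stop looking modified to the plumbing commands.
	git update-index -q --refresh
	# --quiet makes diff-index report differences via its exit status.
	git diff-index --quiet HEAD --
}

# Throwaway repository for the demonstration (assumed setup).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
echo hello >file
git add file
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m initial

# Write the same contents again; only the cached stat data goes stale.
sleep 1
echo hello >file

if worktree_is_clean
then
	result=clean
else
	result=dirty
fi
echo "$result"
```

Without the "update-index -q --refresh" line, the same check may
report the tree as dirty even though the contents never changed.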

After your reverse application of the patch with "git apply" (or the
second "echo hello" into the file in the above example), the
contents of the file are equivalent to what is in the index, but the
last modified timestamp (among others) is different because you
wrote into the file.  If you do not do "update-index --refresh"
before running "diff-index" again, "diff-index" will notice and
report the fact that you touched that file.  If you run "git status",
"git diff", etc., they internally do "update-index --refresh"; from
then on, until you touch the file on the filesystem again, the
cached "(l)stat" information will match and you will stop seeing the
false positive.
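
If you want to watch the cached information change, something along
these lines should work.  It is only a sketch: the temporary
repository and demo identity are assumptions, and the output of
"ls-files --debug" is meant for manual inspection, so its exact
format may differ between Git versions.

```shell
#!/bin/sh
# Throwaway repository for the demonstration (assumed setup).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
echo hello >file
git add file
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m initial

# Write identical contents a moment later so the timestamps change.
sleep 1
echo hello >file

# Cached stat data as recorded when the file was added.
before=$(git ls-files --debug -- file)

# A Porcelain command refreshes the index as a side effect.
git status >/dev/null

# Cached stat data after the refresh, and what diff-index now reports.
after=$(git ls-files --debug -- file)
remaining=$(git diff-index HEAD)

echo "$before"
echo "$after"
```

The two "ls-files --debug" listings should differ in their timestamp
fields, and "diff-index" should have nothing left to report.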