Hi,
We noticed some unexpected behavior of git, when running git commands under
fakeroot, and then running another command without fakeroot.
When running, e.g., git status, or git describe --dirty, git can
update the index file.
This can happen inadvertently by, e.g., running the make install process of
a package under fakeroot, and running "git describe --dirty", or "git
status" to generate package name.
The unexpected result is:
"fakeroot git status" updates the index, and the index now says all files
are owned by uid:0.
"git diff-index --name-only HEAD" is used to test if the git tree is dirty
without fakeroot, concluding all files have changed, since their owner UID
is changed.
You can easily recreate it:
$ mkdir gitexample;cd gitexample;git init
Initialized empty Git repository in /home/elazar/gitexample/.git/
$ touch x; git add x; git commit -m init
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 x
$ git diff-index HEAD # nothing is printed, since working dir clean
$ fakeroot git diff # index is updated to contain uid:0
$ git ls-files -s --debug
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 x
ctime: 1512636533:915178157
mtime: 1512636533:915178157
dev: 64515 ino: 8523907
uid: 0 gid: 0
size: 0 flags: 0
$ git diff-index --name-only HEAD # x is considered different now
x
Make sure you do not have a shell prompt that calls git status in
between, like oh-my-zsh, or bash-it.
This is problematic because:
1. File owner is not really tracked by git porcelain, and I'm not sure
what's the benefit of tracking ownership of files in git diff-index.
Users
who expect git diff to return "no changes" after ownership changed,
expect
git diff-index to return "no changes" as well.
This is supported by the manual page who say "Compares the content
and mode
of the blobs found in a tree object with the corresponding tracked
files in
the working tree", and doesn't mention file ownership.
2. Linux tree uses git diff-index to test if the working tree is dirty. It
can, and has cause subtle issues when building and packaging different
parts of the tree to different directories concurrently.
Our proposed solutions:
Make match_stat_data ignore ownership changes.
I searched git source for usage of OWNER_CHANGED bit and found that no one
tests for it. I can't think of a scenario where the fact that the owner have
changed would matter.
I searched for usage of git-diff-index in git tree, and found that
git-stash uses it. This is not relevant since stash would ignore ownership
changes. I saw that merge-octupus and filter-branch uses git-diff-index
but they also don't seem to care if ownership has changed.
I searched for the first commit who tracked file ownership in the index,
it was there from the first commit e83c5163316f89bfbde7d9ab23ca2e25604af290
I found no specific reason given for tracking the owner in related commits.
You can see a change in a similar spirit to what I propose in commit
2cb45e95438c113871fbbea5b4f629f9463034e7 where st_dev is ignored by default
due to similar problem in NFS.
What do you think?
Should the behavior remain, and diff-index should say it tracks "mode and
owner changes"?
Would it be OK to change the actual behavior of diff-index to ignore owner
changes?