Re: File owner/group and git

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



David Turner <dturner@xxxxxxxxxxxxxxxx> writes:

> In unpack-trees.c, in verify_uptodate_1, we check ie_match_stat.  This
> returns OWNER_CHANGED if a file has changed ownership since the index
> was updated.  Do we actually care about that particular case?  Or really
> anything other than DATA_CHANGED?

That's a 10-year old code and there aren't that many people left
who can answer the original rationale, I am afraid ;-)

In general, "Do we actually care?" is not the question we ask in
this area of the code.  "Does it help us to catch real changes, or
does it change spuriously to make it too unreliable a signal to be
useful?" is the question that drives the design of this part of the
system.

DATA_CHANGED is "we know the contents are different without even
looking at the data".  If the size is different from the last time
we hashed the data, the contents must have changed.  The inverse is
not true (and that is half of the "racy git" issue).

Other *_CHANGED are finely classified only because originally we
didn't really know which are useful to treat as notable change
event, and "changed" variable had sufficient number of bits to hold
different classification, so that we could pick and choose which
ones we truly care.  We knew MTIME was useful in the sense that even
if the size is the same, updated mtime is good enough indication
that the stuff has changed, even to "make" utility.

INODE and CTIME are not so stable on some filesystems (e.g. inum may
not be stable on a network share across remount) and in some
environments (e.g. some virus scanners touch ctime to mark scanned
files, cf. 1ce4790b), and would trigger false positives too often to
be useful.  We always paid attention to them initially, but there
are configurations to tell Git not raise them these days.

OWNER probably falls into a category that is stable enough to be
useful, as the most likely way for it to change is not by running
"chown" on the file in-place (which does not change the contents),
but by running "mv" to drop another file owned by somebody else to
the original location (which likely does change the contents).  At
the same time, "mv" a different file into the path would likely
trigger changes to INODE and MTIME as well, so it cannot be more
than belt-and-suspenders measure to catch modification.  In that
sense ignoring OWNER would not hurt too much.

If it changes spuriously to make it too unreliable a signal to be
useful, it certainly is OK to introduce a knob to ignore it.  It
might even make sense to ignore it unconditionally if the false hit
happens too frequently, but offhand my gut reaction is that there
may be something wrong in the environment (i.e. system outside Git
in which Git runs) if owner/group changes spuriously to cause
issues.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]