Re: [RFC PATCH] checkout: Force matching mtime between files

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



W dniu śro, 25.04.2018 o godzinie 11∶18 -0400, użytkownik Marc Branchaud
napisał:
> On 2018-04-25 04:48 AM, Junio C Hamano wrote:
> > "Robin H. Johnson" <robbat2@xxxxxxxxxx> writes:
> > 
> > > In the thread from 6 years ago, you asked about tar's behavior for
> > > mtimes. 'tar xf' restores mtimes from the tar archive, so relative
> > > ordering after restore would be the same, and would only rebuild if the
> > > original source happened to be dirty.
> > > 
> > > This behavior is already non-deterministic in Git, and would be improved
> > > by the patch.
> > 
> > But Git is not an archiver (tar), but is a source code control
> > system, so I do not think we should spend any extra cycles to
> > "improve" its behaviour wrt the relative ordering, at least for the
> > default case.  Only those who rely on having build artifact *and*
> > source should pay the runtime (and preferrably also the
> > maintainance) cost.
> 
> Anyone who uses "make" or some other mtime-based tool is affected by 
> this.  I agree that it's not "Everyone" but it sure is a lot of people.
> 
> Are we all that sure that the performance hit is that drastic?  After 
> all, we've just done write_entry().  Calling utime() at that point 
> should just hit the filesystem cache.
> 
> > The best approach to do so is to have those people do the "touch"
> > thing in their own post-checkout hook.  People who use Git as the
> > source control system won't have to pay runtime cost of doing the
> > touch thing, and we do not have to maintain such a hook script.
> > Only those who use the "feature" would.
> 
> The post-checkout hook approach is not exactly straightforward.
> 
> Naively, it's simply
> 
> 	for F in `git diff --name-only $1 $2`; do touch "$F"; done
> 
> But consider:
> 
> * Symlinks can cause the wrong file to be touched.  (Granted, Michał's 
> proposed patch also doesn't deal with symlinks.)  Let's assume that a 
> hook can be crafted will all possible sophistication.  There are still 
> some fundamental problems:
> 
> * In a "file checkout" ("git checkout -- path/to/file"), $1 and $2 are 
> identical so the above loop does nothing.  Offhand I'm not even sure how 
> a hook might get the right files in this case.
> 
> * The hook has to be set up in every repo and submodule (at least until 
> something like Ævar's experiments come to fruition).
> 
> * A fresh clone can't run the hook.  This is especially important when 
> dealing with submodules.  (In one case where we were bit by this, make 
> though that half of a fresh submodule clone's files were stale, and 
> decided to re-autoconf the entire thing.)
> 
> 
> I just don't think the hook approach can completely solve the problem.
> 

There's also the performance aspect.  If we deal with checkouts that
include 1000+ files on a busy system (i.e. when mtimes really become
relevant), calling utime() instantly has a good chance of hitting warm
cache.  On the other hand, post-checkout hook has a greater risk of
running cold cache, i.e. writing to all inodes twice.

-- 
Best regards,
Michał Górny




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux