Re: Consist timestamps within a checkout/clone

On Mon, 31 Oct 2022, Taylor Blau wrote:

> On Mon, Oct 31, 2022 at 09:21:20PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > I think you're almost certainly running into the parallel checkout,
> > which is new in that revision range. Try tweaking checkout.workers and
> > checkout.thresholdForParallelism (see "man git-config").
> >
> > I can't say without looking at the code/Makefile (and even then, I don't
> > have time to dig here:), but if I had to bet I'd say that your
> > dependencies have probably always been broken with these checked-in
> > files, but they happened to work out if they were checked out in sorted
> > order.
> >
> > And now with the parallel checkout they're not guaranteed to do that, as
> > some workers will "race ahead" and finish in an unpredictable order.
> 
> Doesn't checkout.thresholdForParallelism only matter when
> checkout.workers != 1?
> 
> So what you wrote seems like a reasonable explanation, but only if the
> original reporter set checkout.workers to imply the non-sequential
> behavior in the first place.
> 
> That said...
> 
>   - I also don't know off-hand of a place where we've defined the order
>     where Git will checkout files in the working copy. So depending on
>     that behavior isn't a safe thing to do.
> 
>   - Committing build artifacts into your repository is generally
>     discouraged.

If it's undefined and never implemented this is reasonable.

But "generally" is a caveat, so while I agree with the statement it also 
implies there are valid cases outside of that. Ones which used to work, too.

Here are some useful cases I have seen for the combination of build rule + 
checked in file:

- part of a build requires licensed software that's not always available

- part of the build requires large memory that other builders generally do 
  not have available

- part of the build process uses a different platform or some other system 
  requirement

- part of the build fetches data, e.g. from a URL; the repository keeps 
  the URL/automation on record plus a copy of the file for offline use

So it's useful to retain repeatable automation without always building 
from square one.

Checking in build results is generally discouraged, yes, but I've found 
it very practical.
 
> So while I'd guess that setting `checkout.workers` back to "1" (if it 
> wasn't already) will probably restore the existing behavior, counting on 
> that behavior in the first place is wrong.

I think perhaps the tail is wagging the dog here, though.

It's 'wrong' because it doesn't work; but I haven't seen anything to make 
me think this is fundamentally or theoretically flawed.

If we had a transactional file system we'd reasonably expect a checkout to 
be an atomic operation -- same timestamp on the files created in that 
step. A discrepancy in timestamps would be considered incorrect; it would 
imply an 'order' to the checkout which, as you say, is order-less.
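
To make the dependence on timestamps concrete, here is a rough Python 
sketch of the staleness test that make-style tools apply: a target is 
rebuilt if it is missing or strictly older than any prerequisite. The 
function name needs_rebuild and the file names are made up for 
illustration; this is not Make's actual code.

```python
import os
import tempfile


def needs_rebuild(target, prerequisites):
    """Return True if target is missing or older than any prerequisite.

    Hypothetical helper that mirrors the usual make-style rule; equal
    timestamps count as up to date.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.stat(target).st_mtime_ns
    return any(os.stat(p).st_mtime_ns > target_mtime for p in prerequisites)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Illustrative names: a source file and its checked-in build result.
        source = os.path.join(workdir, "input.src")
        result = os.path.join(workdir, "output.gen")
        for path in (source, result):
            open(path, "w").close()

        # Result written before the source: looks stale, would be rebuilt.
        os.utime(result, ns=(1_000_000_000, 1_000_000_000))
        os.utime(source, ns=(2_000_000_000, 2_000_000_000))
        print("result older than source:", needs_rebuild(result, [source]))

        # Same timestamp on both: considered up to date.
        os.utime(result, ns=(2_000_000_000, 2_000_000_000))
        print("identical timestamps:", needs_rebuild(result, [source]))
```

Under this rule a checked-in result with the same timestamp as its 
source counts as up to date, so no particular checkout order is needed.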

So what could be the bad outcomes if Git created files stamped with the 
point in time of the "git checkout"?
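
For experimenting with the idea outside of Git, a minimal sketch of a 
post-checkout step that gives every tracked file one shared timestamp 
might look like the following. The helpers tracked_files and 
stamp_uniform are hypothetical names of mine; the only Git command used 
is "git ls-files -z". This is a workaround run after the checkout, not 
a description of anything Git does itself.

```python
import os
import subprocess
import time


def tracked_files(repo):
    """List the tracked paths of the work tree at repo via 'git ls-files -z'."""
    out = subprocess.run(
        ["git", "-C", repo, "ls-files", "-z"],
        check=True,
        capture_output=True,
    ).stdout
    return [os.path.join(repo, os.fsdecode(p)) for p in out.split(b"\0") if p]


def stamp_uniform(paths, when_ns=None):
    """Set atime and mtime of every existing path to one shared instant.

    when_ns is nanoseconds since the epoch and defaults to "now".
    Symlinks and missing paths are skipped to keep the sketch simple.
    Returns the timestamp that was applied.
    """
    if when_ns is None:
        when_ns = time.time_ns()
    for path in paths:
        if os.path.islink(path) or not os.path.exists(path):
            continue
        os.utime(path, ns=(when_ns, when_ns))
    return when_ns


if __name__ == "__main__":
    # Assumes the current directory is inside a Git work tree.
    stamp_uniform(tracked_files("."))
```

One side effect I would expect: Git's index records file timestamps, so 
after restamping, the next "git status" probably has to re-examine the 
files rather than trust its cached stat data.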

> Thanks,
> Taylor
> 
> 

-- 
Mark
