Re: Reminder: GitHub etc. auto-generated archives are not stable in time

On Thu, May 25, 2017 at 12:22:31PM +0200, Jan Pokorný wrote:
> On 25/05/17 00:28 +0000, Zbigniew Jędrzejewski-Szmek wrote:
> > On Wed, May 24, 2017 at 05:22:43PM +0200, Jan Pokorný wrote:
> >> On 24/05/17 09:58 -0500, Jorge Gallegos wrote:
> >>> On Wed, May 24, 2017 at 04:19:05PM +0200, Jan Pokorný wrote:
> >>>> today, I've accidentally attested there are no stability guarantees
> >>>> with the on-demand archives from common git hosting sites when preparing
> >>>> a new pacemaker update, redownloading "spectool -s 0 pacemaker.spec"
> >>>> of the original (-0.1.rc1, from 2 weeks ago) spec and comparing the
> >>> 
> >>> Are you pointing to a _tag_ (or as github likes to call them: release) ?
> >>> As far as I know tags can be re-created, isn't that what is happening
> >>> here?
> >> 
> >> Nope, the point is that nothing has changed in the codebase or, for
> >> that matter, tags.  It must have been GitHub that changed how its
> >> equivalent of "git archive" behaves.
> >> 
> >>>> hashes, which (surprisingly to me) didn't match (they were at any similar
> >>>> test in the past).  Then I looked at the adiff output:
> >>>> 
> >>>>> diff -ru Unpack-2241/pacemaker-Pacemaker-1.1.17-rc1/configure.ac Unpack-6255/pacemaker-Pacemaker-1.1.17-rc1/configure.ac
> >>>>> --- Unpack-2241/pacemaker-Pacemaker-1.1.17-rc1/configure.ac	2017-05-09 00:55:15.000000000 +0200
> >>>>> +++ Unpack-6255/pacemaker-Pacemaker-1.1.17-rc1/configure.ac	2017-05-09 00:55:15.000000000 +0200
> >>>>> @@ -1159,7 +1159,7 @@
> >>>>>  AC_PATH_PROGS(GIT, git false)
> >>>>>  AC_MSG_CHECKING(build version)
> >>>>>  
> >>>>> -BUILD_VERSION=0459f40
> >>>>> +BUILD_VERSION=0459f40958
> >>>>>  if test  != ":%h$"; then
> >>>>>     AC_MSG_RESULT(archive hash: )
> >>>>  
> >>>> for configure.ac that indeed has export-subst git attribute set
> >>>> and the change itself arises from "$Format:%h$" substitution.
> >>>> This likely means GitHub was internally updated to use equivalent
> >>>> of git 2.11 feature of abbreviation length autoscaling within
> >>>> last 14 days.
> >>> 
> >>> This is the other bit that makes me think it was actually the
> >>> maintainer's hand that moved this, I don't believe github does anything
> >>> special to the code once it's stored there. There is no way for github
> >>> to alter code afaik?
> >> 
> >> Once more, see "git help archive", ATTRIBUTES section, export-subst
> >> in particular.  That exactly stands for the varying part, which is
> >> implementation-specific, and GH implementation has apparently changed,
> >> leading to changed contents of numerous archives to be downloaded from
> >> that very point.
> > 
> > Well, the title of your mail implies that *any* archive changed.
> 
> Of course, any such archives can change bitwise in between two
> arbitrary moments, simply because some variable in the archiving
> process can change (is even file list linearization assuredly
> stable?).  I've left that, obvious for some, aspect aside in the email
> body, because a changed content is what seems to be entirely new,
> at least to me.  It means that also alternative checksumming approaches
> (zcat archive | sha512sum) are disqualified from the attempts to
> deal with such instabilities.

So far that order has been stable. I'm sure a lot of people would
be quite unhappy if it suddenly changed.
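
To illustrate the distinction being discussed, here is a minimal Python
sketch (not anything GitHub or spectool actually runs). It assumes a
tiny in-memory tarball; make_archive, raw_sum and payload_sum are
hypothetical helper names, and the two BUILD_VERSION strings are taken
from the diff quoted above. Hashing the decompressed stream, as
"zcat archive | sha512sum" does, tolerates a change in the gzip
container, but not a change in the substituted file content.

```python
import gzip
import hashlib
import io
import tarfile


def make_archive(files, gzip_mtime):
    """Build a .tar.gz in memory; gzip_mtime only affects the gzip header."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        for name in sorted(files):       # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = 0               # fixed member timestamp
            tar.addfile(info, io.BytesIO(data))
    gz_buf = io.BytesIO()
    with gzip.GzipFile(fileobj=gz_buf, mode="wb", mtime=gzip_mtime) as gz:
        gz.write(tar_buf.getvalue())
    return gz_buf.getvalue()


def raw_sum(archive):
    """Checksum of the compressed bytes, as a plain sha512sum would see them."""
    return hashlib.sha512(archive).hexdigest()


def payload_sum(archive):
    """Checksum of the decompressed tar stream (zcat archive | sha512sum)."""
    return hashlib.sha512(gzip.decompress(archive)).hexdigest()


old = {"configure.ac": b"BUILD_VERSION=0459f40\n"}
new = {"configure.ac": b"BUILD_VERSION=0459f40958\n"}

a = make_archive(old, gzip_mtime=1)
b = make_archive(old, gzip_mtime=2)   # same content, different gzip header
c = make_archive(new, gzip_mtime=2)   # longer %h substituted into the file

print("raw sums equal (a, b):    ", raw_sum(a) == raw_sum(b))
print("payload sums equal (a, b):", payload_sum(a) == payload_sum(b))
print("payload sums equal (b, c):", payload_sum(b) == payload_sum(c))
```

Only the middle comparison holds: the payload checksum survives a
container-level difference, but nothing at the archive level survives
a different export-subst expansion.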

> > What changed in fact is an archive with %h subst. But %h is
> > inherently unstable: when commits are added to the archive, git will
> > extend the generated abbreviation length to maintain uniqueness.
> 
> This is how the original git implementation was changed recently,
> but it doesn't mean any implementation has to behave that way,
> and who knows what these proprietary services use.

My point was that the %h length will change even under the exact same
git implementation, as the number of objects in the repo grows over
time. In particular, you can always generate a collision with the
abbreviated hash on an unrelated branch, push that to the repo, and
git will extend the original %h.
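
A rough Python sketch of that effect, as a simplified model only: real
git also picks its starting length from the size of the repository,
which this ignores. The function name min_unique_len, the floor of 7
and both 40-digit ids are made up for illustration; only the "0459f40"
prefix comes from the diff above.

```python
def min_unique_len(target, all_ids, floor=7):
    """Shortest prefix of target (at least floor hex digits) that no
    other id in all_ids shares."""
    others = [i for i in all_ids if i != target]
    for n in range(floor, len(target) + 1):
        prefix = target[:n]
        if not any(o.startswith(prefix) for o in others):
            return n
    return len(target)


# Hypothetical 40-digit object ids sharing their first 7 hex digits.
tagged = "0459f40" + "a" * 33
collider = "0459f40" + "b" * 33

before = min_unique_len(tagged, [tagged])
after = min_unique_len(tagged, [tagged, collider])
print(before, after)
```

Nothing about the tagged commit changes between the two calls; only
the set of other objects in the repository does, and the abbreviation
grows by one digit.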

> > I think the error here is in relying on the stability of %h
> > substitution over time.
> 
> I'd say more broadly that the error is to rely on stability of
> auto-generated archives as such.  In that light, Colin's git-evtag
> project looks appealing for upstreams that do not provide their own
> stable tarballs.

The GitHub archives are very convenient. I wouldn't discount them
before actual problems appear.

Zbyszek