On Sat, Jan 23, 2021 at 01:05:00AM +0000, brian m. carlson wrote:

> > Right now, "git archive" operations are bit-for-bit identical across all
> > versions going back at least 8+ years. In fact, we've been relying on this to
> > support bundling tarball signatures with git tags themselves (via git notes).
> > E.g. you can see this in action here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tag/?h=v5.10.9
> >
> > If you click on "(sig)", you will download a signature that can be used to
> > verify the tarball generated using "git archive".
>
> Please do not rely on this behavior. I want to state in the strongest
> possible terms that this is not guaranteed behavior and it may change at
> any time. We have explicitly said so on the list multiple times. If
> you need reproducible archives, you need to add a tool to canonicalize
> them in a suitable format and not rely on Git to never change things.

I strongly second this. :)

It's also not quite true that things have remained bit-for-bit identical
for all that time. We have fixed bugs in that time, although they do not
always cause a change in every output tarball (they often depend on
corner cases like having long pathnames). Two off the top of my head
(that have indeed caused people to complain about changing checksums):

  - 22f0dcd963 (archive-tar: split long paths more carefully,
    2013-01-05)

  - 82a46af13e (archive-tar: fix pax extended header length calculation,
    2019-08-17)

We also rely on system gzip. That's pretty stable, but I have heard tell
that even `gzip -n` may differ on platforms.

Another fun one I saw recently: using export-subst with $Format:%h$ will
produce different results depending on how many objects are present in
the repository running git-archive.

-Peff
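
As a rough illustration of the "tool to canonicalize" idea quoted above,
here is a minimal Python sketch that hashes the logical content of a tar
archive instead of its raw bytes. It is not part of Git; the names
canonical_digest and _update_field are hypothetical, and the choice of
what to cover (path, entry type, permission bits, link target, file data)
and what to ignore (header layout, timestamps, ownership, pax global
header, compression) is an assumption that a real tool would need to
settle deliberately.

```python
import hashlib
import tarfile


def _update_field(digest, data: bytes) -> None:
    # Length-prefix every field so adjacent fields cannot run together.
    digest.update(len(data).to_bytes(8, "big"))
    digest.update(data)


def canonical_digest(fileobj) -> str:
    """Return a SHA-256 over the logical content of a tar archive.

    Hypothetical helper (an assumption, not a Git feature). It covers
    path, entry type, permission bits, link target and file data, and
    ignores header layout, timestamps, ownership and compression.
    `fileobj` must be a seekable binary file object.
    """
    digest = hashlib.sha256()
    # "r:*" lets tarfile detect gzip/bzip2/xz or plain tar transparently.
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        # Sort by path so the member order in the archive does not matter.
        for member in sorted(tar.getmembers(), key=lambda m: m.name):
            _update_field(digest, member.name.encode("utf-8", "surrogateescape"))
            _update_field(digest, member.type)  # bytes, e.g. b"0" for a regular file
            _update_field(digest, format(member.mode & 0o7777, "o").encode("ascii"))
            _update_field(digest, member.linkname.encode("utf-8", "surrogateescape"))
            if member.isfile():
                # Stream the file data in chunks, prefixed by its size.
                digest.update(member.size.to_bytes(8, "big"))
                with tar.extractfile(member) as stream:
                    for chunk in iter(lambda: stream.read(65536), b""):
                        digest.update(chunk)
    return digest.hexdigest()
```

Under these assumptions, one would sign the value returned by
canonical_digest(open(path, "rb")) rather than the tarball bytes, so that
a change in how long paths are split across headers, or in the gzip
implementation, would not invalidate the signature.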