Re: [PATCH 0/9] git archive: use gzip again by default, document output stabilty

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ævar

On 02/02/2023 09:32, Ævar Arnfjörð Bjarmason wrote:
As reported in
https://lore.kernel.org/git/a812a664-67ea-c0ba-599f-cb79e2d96694@xxxxxxxxx/
changing the default "tgz" output method of from "gzip(1)" to our
internal "git archive gzip" (using zlib ) broke things for users in
the wild that assume that the "git archive" output is stable, most
notably GitHub: https://github.com/orgs/community/discussions/45830
>
Leaving aside the larger question of whether we're going to promise
output stability for "git archive" in general, the motivation for that
change was to have a working compression method on systems that lacked
a gzip(1).

As I recall the reduction in cpu time used to create a compressed archive was a factor in making it the default.

As the disruption of changing the default isn't worth it, let's use
gzip(1) again by default, and only fall back on the new "git archive
gzip" if it isn't available.

Playing devil's advocate for a moment as we're not going to promise that the compressed output of "git archive" will be stable in the future perhaps we should use this breakage as an opportunity to highlight that to users and to advertize the config setting that allows them to use gzip for compressing archives. Reverting the change gives the misleading impression that we're making a commitment to keeping the output stable. The focus of this thread seems to be the problems relating to github which they have already addressed.

I think there is general agreement that it is not practical to promise that the compressed output of "git archive" is stable so maybe it is better to make that clear now while users can work around it in the short term with a config setting rather than waiting until we're faced with some security or other issue that forces a change to the output which users cannot work around so easily.

Best Wishes

Phillip


The later parts of this series then document and test for the output
stability of the command.

We're not promising anything new there, except that we now promise
that we're going to use "gzip" as the default compressor, but that
it's up to that command to be stable, should the user desire output
stability.

The documentation discusses the various caveats involved, suggests
alternatives to checksumming compressed archives, but in the end notes
what's been the policy so far: We're not promising that the "tar"
output is going to be stable.

The early parts of this series (1-2/9) are clean-up for existing
config drift, as later in the series we'll otherwise need to change
the divergent config documentation in two places.

CI & branch for this at:
https://github.com/avar/git/tree/avar/archive-internal-gzip-not-the-default

Ævar Arnfjörð Bjarmason (9):
   archive & tar config docs: de-duplicate configuration section
   git config docs: document "tar.<format>.{command,remote}"
   archiver API: make the "flags" in "struct archiver" an enum
   archive: omit the shell for built-in "command" filters
   archive-tar.c: move internal gzip implementation to a function
   archive: use "gzip -cn" for stability, not "git archive gzip"
   test-lib.sh: add a lazy GZIP prerequisite
   archive tests: test for "gzip -cn" and "git archive gzip" stability
   git archive docs: document output non-stability

  Documentation/config/tar.txt           | 29 +++++++-
  Documentation/git-archive.txt          | 96 +++++++++++++++++++-------
  archive-tar.c                          | 78 ++++++++++++++-------
  archive.h                              | 11 +--
  t/t5000-tar-tree.sh                    |  2 -
  t/t5005-archive-stability.sh           | 70 +++++++++++++++++++
  t/t5562-http-backend-content-length.sh |  2 -
  t/test-lib.sh                          |  4 ++
  8 files changed, 231 insertions(+), 61 deletions(-)
  create mode 100755 t/t5005-archive-stability.sh




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux