Re: [PATCH v3 4/5] archive-zip: support archives bigger than 4GB

René Scharfe <l.s.r@xxxxxx> · Tue, 25 Apr 2017 18:24:47 +0200

On 25.04.2017 at 09:55, Peter Krefting wrote:
> René Scharfe:
>
>>> This needs to be >=. The spec says that if the value is 0xffffffff,
>>> there should be a zip64 record with the actual size (even if it is
>>> 0xffffffff).
>> Could you please cite the relevant part?
>
> 4.4.8 compressed size: (4 bytes)
> 4.4.9 uncompressed size: (4 bytes)
>
> "If an archive is in ZIP64 format and the value in this field is
> 0xFFFFFFFF, the size will be in the corresponding 8 byte ZIP64 extended
> information extra field."
>
> Of course, there is no definition of how they define that "an archive is
> in ZIP64 format", but I would say that is whenever it has any ZIP64
> structures.

I struggled with that sentence as well.  There is no explicit "format"
field AFAICS.  The closest things at the archive level are the zip64 end
of central directory record and its locator.  But what really matters is
the presence of a zip64 extended information extra field to hold the
64-bit size value.
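
To make the reader side concrete, here is a rough sketch of how a size
could be resolved from the 4-byte field plus the extra data.  It is not
taken from the patch; the function names are made up for illustration.
Falling back to the literal value when no zip64 extra field is present
is an assumption about the ambiguous case, not something the spec spells
out.

```python
import struct

ZIP64_EXTRA_ID = 0x0001   # header ID of the zip64 extended information extra field
SENTINEL_32 = 0xFFFFFFFF  # 4-byte marker meaning "see the zip64 extra field"


def find_extra(extra: bytes, header_id: int):
    """Return the payload of the first extra block with header_id, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        # Each block is: 2-byte header ID, 2-byte data size, then the data.
        block_id, size = struct.unpack_from("<HH", extra, pos)
        if block_id == header_id:
            return extra[pos + 4 : pos + 4 + size]
        pos += 4 + size
    return None


def uncompressed_size(field32: int, extra: bytes) -> int:
    """Resolve the uncompressed size of one entry (hypothetical helper)."""
    if field32 != SENTINEL_32:
        return field32
    payload = find_extra(extra, ZIP64_EXTRA_ID)
    if payload is None or len(payload) < 8:
        # Assumption: without a zip64 extra field the value is a literal size.
        return field32
    # The original (uncompressed) size is the first 8-byte value in the block.
    return struct.unpack_from("<Q", payload, 0)[0]
```

Note that the decision hinges on the extra field being present, not on
any archive-level flag.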

There's also this general note a bit higher up:

      "4.4.1.4  If one of the fields in the end of central directory
      record is too small to hold required data, the field should be
      set to -1 (0xFFFF or 0xFFFFFFFF) and the ZIP64 format record
      should be created."

My interpretation: An archiver that can only emit 32-bit ZIP files
(either because it doesn't support ZIP64 or due to a compatibility
option set by the user) writes 32-bit size fields and has no defined way
to deal with overflows.  An archiver that is allowed to use ZIP64 can
emit zip64 extras as needed.

Or in other words: A legacy ZIP archive and a ZIP64 archive can be
bit-wise the same if all values for all entries fit into the legacy
fields, but the difference in terms of the spec is what the archiver was
allowed to do when it created them.

	# 4-byte sizes, not ZIP64
	arch --format=zip ...

	# ZIP64, can use 8-byte sizes as needed
	arch --format=zip64 ...
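
The same idea on the writer side, as a sketch only: encode_sizes() and
its allow_zip64 parameter are hypothetical and stand in for the two
modes above.  It uses >= for the threshold, following the reading quoted
earlier in this thread, and it raises an error on overflow in 32-bit
mode, which is just one possible choice since the spec defines none.

```python
import struct

ZIP64_EXTRA_ID = 0x0001   # header ID of the zip64 extended information extra field
SENTINEL_32 = 0xFFFFFFFF  # value stored in a 4-byte field that overflowed


def encode_sizes(uncompressed: int, compressed: int, allow_zip64: bool):
    """Return (uncompressed32, compressed32, extra) for a local header.

    Hypothetical helper; allow_zip64 models what the archiver is
    permitted to emit, not a field stored in the archive.
    """
    needs_zip64 = uncompressed >= SENTINEL_32 or compressed >= SENTINEL_32
    if not needs_zip64:
        # Everything fits: the output is identical in both modes.
        return uncompressed, compressed, b""
    if not allow_zip64:
        # Assumption: fail loudly instead of writing truncated sizes.
        raise OverflowError("size needs ZIP64, which is disabled")
    # In a local header the zip64 extra field carries both 8-byte sizes.
    extra = struct.pack("<HHQQ", ZIP64_EXTRA_ID, 16, uncompressed, compressed)
    return SENTINEL_32, SENTINEL_32, extra
```

For small entries encode_sizes(10, 5, False) and encode_sizes(10, 5, True)
return the same values, which is the "bit-wise the same" case.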

Makes sense?

René