Re: [PATCH v3 4/5] archive-zip: support archives bigger than 4GB

On 26.04.2017 at 23:02, Peter Krefting wrote:
René Scharfe:

I struggled with that sentence as well. There is no explicit "format" field AFAICS.

Exactly. I interpret that to mean an archive is in zip64 format if it contains any zip64 structures (especially a zip64 end of central directory locator).

The crucial point is that I think the choice is per entry, i.e. if we
had to write a zip64 record for one file we can still emit a legacy
record for the next file that has a size of 0xffffffff.

Or in other words: A legacy ZIP archive and a ZIP64 archive can be bit-wise the same if all values for all entries fit into the legacy fields, but the difference in terms of the spec is what the archiver was allowed to do when it created them.

As long as all sizes are below (unsigned) -1, they would be identical. If one, and only one, of the sizes is equal to (unsigned) -1 (and none overflow), then it is up to interpretation whether or not a ZIP64-aware archiver is allowed to output an archive that is not in ZIP64 format. If any single size or value overflows the 32 (16) bit fields, then ZIP64 format is needed.
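
To make the three cases concrete, here is a minimal Python sketch of the per-entry decision. The function name `needs_zip64` and the `treat_max_as_overflow` flag are made up for illustration and are not taken from the patch; the flag models the open question of whether a value equal to 0xffffffff may still go into a legacy field. Only the 32-bit fields are covered, not the 16-bit ones.

```python
ZIP32_MAX = 0xFFFFFFFF  # (unsigned) -1 for a 32-bit field


def needs_zip64(size, compressed_size, offset, treat_max_as_overflow=False):
    """Decide whether one entry needs a zip64 record (illustrative only).

    With treat_max_as_overflow=False a value of exactly 0xffffffff still
    goes into the legacy field; with True it forces a zip64 record.
    """
    limit = ZIP32_MAX - 1 if treat_max_as_overflow else ZIP32_MAX
    return any(value > limit for value in (size, compressed_size, offset))


# Everything below the limit: legacy and zip64-aware output are identical.
print(needs_zip64(1000, 400, 0))
# Exactly 0xffffffff: the answer depends on the interpretation.
print(needs_zip64(ZIP32_MAX, 400, 0))
print(needs_zip64(ZIP32_MAX, 400, 0, treat_max_as_overflow=True))
# Real overflow: zip64 is needed either way.
print(needs_zip64(ZIP32_MAX + 1, 400, 0))
```

Since the check takes the values of a single entry, it can be applied per entry, independent of what was written for earlier files.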

Sizes can be stored in zip64 entries even if they are lower (from a
paragraph about the data descriptor):

"4.3.9.2 When compressing files, compressed and uncompressed sizes
      should be stored in ZIP64 format (as 8 byte values) when a
      file's size exceeds 0xFFFFFFFF.   However ZIP64 format may be
      used regardless of the size of a file."

(But I don't see a benefit.)

    # 4-byte sizes, not ZIP64
    arch --format=zip ...

    # ZIP64, can use 8-byte sizes as needed
    arch --format=zip64 ...

Makes sense?

Well, I would say that it would be a lot easier to always emit zip64 archives. An old-style unzipper should be able to read them anyway if there are no overflowing fields, right? And, besides, who in 2017 has an unzip tool that is unable to read zip64? Info-Zip UnZip has supported Zip64 since 2009.

Windows XP.  Don't laugh. ;)

If you write zip64 extras for all size records then an old extractor
will only see the value 0xffffffff in them and ignore the zip64 part --
or ignore the entries outright.

Writing zip64 records only as needed saves space -- and that's what
zipping is all about, isn't it?
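
For a rough idea of the cost, here is a small Python sketch of what "only as needed" means for the size fields of a local header. The layout follows my reading of the zip64 extended information extra field in APPNOTE (header ID 0x0001, a 2-byte length, then 8-byte sizes, with both sizes present in a local header). The helper name `local_header_sizes` is hypothetical and this is not the code from the patch.

```python
import struct

ZIP32_MAX = 0xFFFFFFFF
ZIP64_EXTRA_ID = 0x0001  # header ID of the zip64 extended information field


def local_header_sizes(size, compressed_size):
    """Return (legacy_size, legacy_compressed_size, extra_field_bytes)."""
    if size > ZIP32_MAX or compressed_size > ZIP32_MAX:
        # Both legacy fields become 0xffffffff and the real values move
        # into the extra field as little-endian 8-byte integers.
        extra = struct.pack("<HHQQ", ZIP64_EXTRA_ID, 16, size, compressed_size)
        return ZIP32_MAX, ZIP32_MAX, extra
    # Everything fits: plain legacy record, no extra bytes.
    return size, compressed_size, b""


small = local_header_sizes(1000, 400)
big = local_header_sizes(5 * 2**30, 4 * 2**30)
print(small[:2], len(small[2]))
print(big[:2], len(big[2]))
```

In this sketch an entry that needs zip64 carries 20 additional bytes in its local header, and an old extractor reading it sees only 0xffffffff in both legacy size fields.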

Adding unnecessary zip64 records would produce different ZIP files than
earlier versions of git archive.  That's not a strong argument as changes
to libz can potentially do the same, but it still might affect someone
who caches generated ZIP files.

What I sent matches the behavior of InfoZIP zip (modulo bugs).  Why not
follow their lead?

(And one of the bugs in my patches is not setting the version field to 45,
as you pointed out earlier already.  InfoZIP may forget to do that if it
uses a zip64 extra for recording the offset, but it does set the version
correctly for files bigger than 4GB.)
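
A sketch of the version rule in Python, to show what I mean. The values come from the "version needed to extract" table in APPNOTE (1.0 as the default, 2.0 for deflate, 4.5 for zip64 extensions, stored as 10, 20 and 45). The function name `version_needed` is hypothetical, and this is not how the patch is structured.

```python
import struct


def version_needed(uses_zip64, deflated):
    """Pick the 'version needed to extract' value for one entry."""
    if uses_zip64:
        return 45  # 4.5: entry uses zip64 format extensions
    if deflated:
        return 20  # 2.0: entry is compressed with deflate
    return 10      # 1.0: default, e.g. a stored entry


# The field is a 2-byte little-endian integer in the header.
field = struct.pack("<H", version_needed(uses_zip64=True, deflated=True))
print(field.hex())
```

The point is that any zip64 extra, including one that only records an offset, should bump the value to 45.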

What do other archivers do?

But I think a more important question is: Can the generated files
be extracted by popular tools (most importantly Windows' built-in
functionality, I guess)?

René


