Re: git archive --format zip utf-8 issues

On 18.09.2012 23:12, Junio C Hamano wrote:
René Scharfe <rene.scharfe@xxxxxxxxxxxxxx> writes:

                                          Windows    Info-ZIP unzip
                             7-Zip PeaZip builtin Linux msysgit Windows
7-Zip 9.20                      0      0      46    26      43      43
PeaZip 4.7.1 win64              0      0      46    26      42      42
Info-ZIP zip 3.0 Linux          0      0      72     0      43      43
Info-ZIP zip 3.0 Windows       45     45     n/a     0      43      43

It is kind of surprising that "Windows builtin" has very poor score
extracting from the output of Zip tools running on Windows (I am
looking at 46, 46 and n/a over there).  If you tell it to create an
archive from its disk and then extract from it, I wonder what would
happen.

I didn't include it as a packer because it refused to archive the pangrams directory due to illegal characters in one of the filenames. When I tried a bit harder just now, I had to delete everything except 14 files (Latin script, accents etc.) before I could zip the directory. I'll include these results in the next round.

It uses codepage 850 on my system (MSDOS Latin 1). I don't expect this to be portable.
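
To illustrate why a codepage 850 name is unlikely to survive elsewhere, here is a minimal Python sketch. The filename and the choice of codepage 1252 as the "other" system are assumptions made for the example; they are not taken from the test above.

```python
# Hypothetical filename; only the encodings matter here.
name = "René.txt"

# Bytes as a packer using codepage 850 would store them.
raw = name.encode("cp850")

# An unpacker assuming codepage 1252 sees a different character.
as_cp1252 = raw.decode("cp1252")

# An unpacker insisting on UTF-8 cannot decode the name at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(raw, as_cp1252, utf8_ok)
```

The stored bytes are the same in all three cases; only the reader's assumption about the codepage changes.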

Does this result mean that practically nobody uses Zip archive with
exotic letters in paths on that platform?  I am not talking about
developers and savvy people who know where to download third-party
Zip archivers and how to install them.  I am imagining a grandma who
received an archive full of photos of her grandchild in her Outlook
Express or GMail inbox, clicked the attachment to download it, and
is trying to view the photo inside.

Not necessarily. Photos often have names like img_0123.jpg etc., which are handled just fine. And all family members probably use the same codepage on their computers, so they're less likely to run into this problem.

By the way, I found this bug asking for codepage support in unzip:

  https://bugs.launchpad.net/ubuntu/+source/unzip/+bug/580961

Multiple codepages seem to be used for ZIP files in the wild. None of them is supported by unzip on Linux, which only accepts ASCII or UTF-8.
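
For archives written with a legacy codepage, one possible workaround is to redo the name decoding by hand. The Python sketch below assumes the standard zipfile module, which decodes names as cp437 when the UTF-8 flag (general purpose bit 11) is not set. The helper names are hypothetical, and the codepage passed in has to be guessed or known from elsewhere, since the archive does not record it.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: name is stored as UTF-8


def redecode(name, codepage):
    """Undo zipfile's cp437 decoding and apply the assumed codepage instead."""
    return name.encode("cp437").decode(codepage)


def list_names(zip_bytes, codepage):
    """Return member names, re-decoding those not flagged as UTF-8."""
    names = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.flag_bits & UTF8_FLAG:
                # Name was stored as UTF-8; zipfile already decoded it correctly.
                names.append(info.filename)
            else:
                names.append(redecode(info.filename, codepage))
    return names
```

Calling list_names(data, "cp850") on an archive packed under codepage 850 should give back the intended names; with the wrong codepage it silently produces different characters, which is the problem described above.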

René
