On Sat, Jun 18, 2022 at 10:19:28PM +0200, Johannes Schindelin wrote: > Hi Adam, > > On Wed, 15 Jun 2022, Adam Dinwoodie wrote: > > > On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote: > > > Adam Dinwoodie <adam@xxxxxxxxxxxxx> writes: > > > > > > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh > > > >> index d6027189e2..3992d08158 100755 > > > >> --- a/t/t5003-archive-zip.sh > > > >> +++ b/t/t5003-archive-zip.sh > > > >> @@ -207,13 +207,21 @@ check_zip with_untracked > > > >> check_added with_untracked untracked untracked > > > >> > > > >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' ' > > > >> + if test_have_prereq FUNNYNAMES > > > >> + then > > > >> + PATHNAME="pathname with : colon" > > > >> + else > > > >> + PATHNAME="pathname without colon" > > > >> + fi && > > > >> git archive --format=zip >with_file_with_content.zip \ > > > >> + --add-virtual-file=\""$PATHNAME"\": \ > > > >> --add-virtual-file=hello:world $EMPTY_TREE && > > > >> test_when_finished "rm -rf tmp-unpack" && > > > >> mkdir tmp-unpack && ( > > > >> cd tmp-unpack && > > > >> "$GIT_UNZIP" ../with_file_with_content.zip && > > > >> test_path_is_file hello && > > > >> + test_path_is_file "$PATHNAME" && > > > >> test world = $(cat hello) > > > >> ) > > > >> ' > > > > > > > > This test is currently failing on Cygwin: it looks like it's exposing a > > > > bug in Cygwin that means files with colons in their name aren't > > > > correctly extracted from zip archives. I'm going to report that to the > > > > Cygwin mailing list, but I wanted to note it for the record here, too. > > > > > > Does this mean that our code to set FUNNYNAMES prerequiste is > > > slightly broken? IOW, should we check with a path with a colon in > > > it, as well as whatever we use currently for FUNNYNAMES? > > > > > > Something like the attached patch? > > > > > > Or does Cygwin otherwise work perfectly well with a path with a > > > colon in it, but only $GIT_UNZIP command has problem with it? If > > > that is the case, then please disregard the attached. > > > > The latter: Cygwin works perfectly with paths containing colons, except > > that Cygwin's `unzip` is seemingly buggy and doesn't work. The file > > systems Cygwin runs on don't support colons in paths, but Cygwin hides > > that problem by rewriting ASCII colons to some high Unicode code point > > on the filesystem, > > Let me throw in a bit more detail: The forbidden characters are mapped > into the Unicode page U+f0XX, which is supposed to be used "for private > purposes". Even more detail can be found here: > https://github.com/cygwin/cygwin/blob/cygwin-3_3_5-release/winsup/cygwin/strfuncs.cc#L19-L23 > > > meaning Cygwin-native applications see a regular colon, while > > Windows-native applications see an unusual but perfectly valid Unicode > > character. > > Now, I have two questions: > > - Why does `unzip` not use Cygwin's regular functions (which should all be > aware of that U+f0XX <-> U+00XX mapping)? That is an excellent question! This behaviour came from an `#ifdef __CYGWIN__` in the upstream unzip package; with that #ifdef removed, everything works as expected. The folk on the Cygwin mailing list had no idea *why* that #ifdef was there, given it's evidently unnecessary; my best guess is that it was added a long time ago before Cygwin could handle those characters in the general case. Since my report, the Cygwin package has picked up a new maintainer who has released a version of the unzip package with that #ifdef removed, so this test is now passing. > - Even more importantly: would the test case pass if we simply used > another forbidden character, such as `?` or `*`? The set of characters that had special handling in unzip was "*:?|<> all of which are handled appropriately by Cygwin applications in general, and all of which had this unnecessary handling in `unzip` > > I tested the same patch to FUNNYNAMES myself before reporting, and the > > test fails exactly the same way. If we wanted to catch this, I think > > we'd need a test that explicitly attempted to unzip an archive > > containing a path with a colon. > > > > (The code to set FUNNYNAMES *is* slightly broken, per the discussions > > around 6d340dfaef ("t9902: split test to run on appropriate systems", > > 2022-04-08), and my to-do list still features tidying up and > > resubmitting the patch Ævar wrote in that discussion thread. But it > > wouldn't help here because this issue is specific to Cygwin's `unzip`, > > rather than a general limitation of running on Cygwin.) > > I'd rather avoid changing FUNNYNAMES at this stage, if we can help it. Oh yes, I definitely wasn't proposing changing things for 2.37.0! I just wanted to acknowledge that there is a known issue here that has been discussed on this list previously, that we (I) would hopefully get around to fixing at some point. Adam