Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jun 18, 2022 at 10:19:28PM +0200, Johannes Schindelin wrote:
> Hi Adam,
> 
> On Wed, 15 Jun 2022, Adam Dinwoodie wrote:
> 
> > On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote:
> > > Adam Dinwoodie <adam@xxxxxxxxxxxxx> writes:
> > >
> > > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> > > >> index d6027189e2..3992d08158 100755
> > > >> --- a/t/t5003-archive-zip.sh
> > > >> +++ b/t/t5003-archive-zip.sh
> > > >> @@ -207,13 +207,21 @@ check_zip with_untracked
> > > >>  check_added with_untracked untracked untracked
> > > >>
> > > >>  test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> > > >> +	if test_have_prereq FUNNYNAMES
> > > >> +	then
> > > >> +		PATHNAME="pathname with : colon"
> > > >> +	else
> > > >> +		PATHNAME="pathname without colon"
> > > >> +	fi &&
> > > >>  	git archive --format=zip >with_file_with_content.zip \
> > > >> +		--add-virtual-file=\""$PATHNAME"\": \
> > > >>  		--add-virtual-file=hello:world $EMPTY_TREE &&
> > > >>  	test_when_finished "rm -rf tmp-unpack" &&
> > > >>  	mkdir tmp-unpack && (
> > > >>  		cd tmp-unpack &&
> > > >>  		"$GIT_UNZIP" ../with_file_with_content.zip &&
> > > >>  		test_path_is_file hello &&
> > > >> +		test_path_is_file "$PATHNAME" &&
> > > >>  		test world = $(cat hello)
> > > >>  	)
> > > >>  '
> > > >
> > > > This test is currently failing on Cygwin: it looks like it's exposing a
> > > > bug in Cygwin that means files with colons in their name aren't
> > > > correctly extracted from zip archives.  I'm going to report that to the
> > > > Cygwin mailing list, but I wanted to note it for the record here, too.
> > >
> > > Does this mean that our code to set FUNNYNAMES prerequiste is
> > > slightly broken?  IOW, should we check with a path with a colon in
> > > it, as well as whatever we use currently for FUNNYNAMES?
> > >
> > > Something like the attached patch?
> > >
> > > Or does Cygwin otherwise work perfectly well with a path with a
> > > colon in it, but only $GIT_UNZIP command has problem with it?  If
> > > that is the case, then please disregard the attached.
> >
> > The latter: Cygwin works perfectly with paths containing colons, except
> > that Cygwin's `unzip` is seemingly buggy and doesn't work.  The file
> > systems Cygwin runs on don't support colons in paths, but Cygwin hides
> > that problem by rewriting ASCII colons to some high Unicode code point
> > on the filesystem,
> 
> Let me throw in a bit more detail: The forbidden characters are mapped
> into the Unicode page U+f0XX, which is supposed to be used "for private
> purposes". Even more detail can be found here:
> https://github.com/cygwin/cygwin/blob/cygwin-3_3_5-release/winsup/cygwin/strfuncs.cc#L19-L23
> 
> > meaning Cygwin-native applications see a regular colon, while
> > Windows-native applications see an unusual but perfectly valid Unicode
> > character.
> 
> Now, I have two questions:
> 
> - Why does `unzip` not use Cygwin's regular functions (which should all be
>   aware of that U+f0XX <-> U+00XX mapping)?

That is an excellent question!  This behaviour came from an `#ifdef
__CYGWIN__` in the upstream unzip package; with that #ifdef removed,
everything works as expected.  The folk on the Cygwin mailing list had
no idea *why* that #ifdef was there, given it's evidently unnecessary;
my best guess is that it was added a long time ago before Cygwin could
handle those characters in the general case.

Since my report, the Cygwin package has picked up a new maintainer who
has released a version of the unzip package with that #ifdef removed, so
this test is now passing.

> - Even more importantly: would the test case pass if we simply used
>   another forbidden character, such as `?` or `*`?

The set of characters that had special handling in unzip was "*:?|<> all
of which are handled appropriately by Cygwin applications in general,
and all of which had this unnecessary handling in `unzip`

> > I tested the same patch to FUNNYNAMES myself before reporting, and the
> > test fails exactly the same way.  If we wanted to catch this, I think
> > we'd need a test that explicitly attempted to unzip an archive
> > containing a path with a colon.
> >
> > (The code to set FUNNYNAMES *is* slightly broken, per the discussions
> > around 6d340dfaef ("t9902: split test to run on appropriate systems",
> > 2022-04-08), and my to-do list still features tidying up and
> > resubmitting the patch Ævar wrote in that discussion thread.  But it
> > wouldn't help here because this issue is specific to Cygwin's `unzip`,
> > rather than a general limitation of running on Cygwin.)
> 
> I'd rather avoid changing FUNNYNAMES at this stage, if we can help it.

Oh yes, I definitely wasn't proposing changing things for 2.37.0!  I
just wanted to acknowledge that there is a known issue here that has
been discussed on this list previously, that we (I) would hopefully get
around to fixing at some point.

Adam



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux