On Wed, Jun 12, 2024 at 9:23 AM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Suddenly t9500.70 has started failing for me in my local environment > (but $Corp IT folks control pretty much which version of base software > is installed and updated at what time, once I choose "I want to do CGI > in Perl" by selecting the libcgi-pm-perl module, I am not sure from > which version the thing was updated from. The Debian version claims > to be 4.62-1. I have the same version (I'm on the same $Corp IT setup). I downgraded to 4.61 and it succeeded. Upgraded to 4.62 again, and it started failing. The only difference in 4.62 besides a version number change in several files is: ``` diff -ur 4.61/usr/share/perl5/CGI.pm 4.62/usr/share/perl5/CGI.pm --- 4.61/usr/share/perl5/CGI.pm 2024-01-08 07:13:22.000000000 -0800 +++ 4.62/usr/share/perl5/CGI.pm 2024-03-01 05:43:03.000000000 -0800 @@ -1,13 +1,14 @@ package CGI; require 5.008001; use Carp 'croak'; +use URI; my $appease_cpants_kwalitee = q/ use strict; use warnings; #/; -$CGI::VERSION='4.61'; +$CGI::VERSION='4.62'; use CGI::Util qw(rearrange rearrange_header make_attributes unescape escape expires ebcdic2ascii ascii2ebcdic); @@ -2747,8 +2748,10 @@ $url .= $path if $path_info and defined $path; $url .= "?$query_str" if $query and $query_str ne ''; $url ||= ''; - $url =~ s/([^a-zA-Z0-9_.%;&?\/\\:+=~-])/sprintf("%%%02X",ord($1))/eg; - return $url; + + $url = URI->new( $url )->canonical; + $url =~ s!/$!!; + return $url } #### Method: cookie ``` I can confirm that backing that out fixes the issue. I also don't know enough perl to know what's wrong with that statement, but looking at the current head version, my guess is that it's the wrong type. The current head version (https://github.com/leejo/CGI.pm/blob/89c51a088db2a45b1c759e02c8d4772f5b6a36a9/lib/CGI.pm#L2752) has `$url = URI->new( $url )->canonical->as_string;`, applying that patch locally makes this work again. Running blame points us at https://github.com/leejo/CGI.pm/issues/263 ("->url being a object breaks everything"). One person even references this gitweb.perl breakage in the issue comments :) So my conclusion is that 4.62 is broken, and that newer versions are unbroken. > > It fails with path-info test, starting at t9500.70 with extra > warning in the log. This code > > if ($path_info) { > # $path_info has already been URL-decoded by the web server, but > # $my_url and $my_uri have not. URL-decode them so we can properly > # strip $path_info. > $my_url = unescape($my_url); > $my_uri = unescape($my_uri); > if ($my_url =~ s,\Q$path_info\E$,, && > $my_uri =~ s,\Q$path_info\E$,, && > defined $ENV{'SCRIPT_NAME'}) { > $base_url = $cgi->url(-base => 1) . $ENV{'SCRIPT_NAME'}; > } > } > > before it calls unescape(), I know $my_url is a > http://localhost/gitweb.cgi and after it calls unescape, it becomes > undefined. That will trigger a "Use of uninitialized value $my_url > in substitution (s///)" warning. > > unescape comes from CGI::Util because we do > > use CGI::Util qw(unescape); > > early in the program. > > As a workaround I locally have the attached patch to disable calling > CGI::Util::unescape implicitly as a sub, and instead make an > explicit call to it as a class method, and it seems to make the > tests pass. Please do not ask me why it works---the reason why I am > posting this message is to find somebody who can explain it to me ;-) I also don't know why this fixes the issue, unfortunately. > > The "unescape" thing in CGI::Util.pm begins with the standard > boilerplate that lets you call it as a plain-vanilla sub as well as > a class method, like so: > > # unescape URL-encoded data > sub unescape { > shift() if @_ > 0 and (ref($[0]) || (defined $[1] && $_[0] eq $CGI::DefaultClass)); > > but it seems that it has been that way since 2009, so it does not > explain why it started breaking for me all of sudden, even though > it _is_ curious that its counterpart in the same file, escape, > starts slightly differently to (presumably) achieve the same thing. > > sub escape { > # If we being called in an OO-context, discard the first argument. > shift() if @_ > 1 and ( ref($[0]) || (defined $[1] && $_[0] eq $CGI::DefaultClass)); > > Notice that the former does "shift" as long as there is even a > single argument, while the latter does so only when there are at > least two arguments. Both presumably would take a single argument, > the string to either escape or unescape, and the shift is presumably > to shift away the class object if they are called as class methods, > so the guard at the beginning of unscape looks suspect, but I am not > a Perl person, and as I said, it seems that the code has been that > way since 2009, so it is very likely that I am barking up a wrong > tree. > > Anyway. TIA for whoever explains the solution to this puzzle to me. > > > gitweb/gitweb.perl | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git c/gitweb/gitweb.perl w/gitweb/gitweb.perl > index ccd14e0e30..a0a8b79ef4 100755 > --- c/gitweb/gitweb.perl > +++ w/gitweb/gitweb.perl > @@ -13,7 +13,7 @@ > # handle ACL in file access tests > use filetest 'access'; > use CGI qw(:standard :escapeHTML -nosticky); > -use CGI::Util qw(unescape); > +use CGI::Util qw(); > use CGI::Carp qw(fatalsToBrowser set_message); > use Encode; > use Fcntl ':mode'; > @@ -22,6 +22,11 @@ > use Time::HiRes qw(gettimeofday tv_interval); > use Digest::MD5 qw(md5_hex); > > +sub unescape { > + my $url = shift; > + return CGI::Util->unescape($url); > +} > + > binmode STDOUT, ':utf8'; > > if (!defined($CGI::VERSION) || $CGI::VERSION < 4.08) { >