On Tue, Feb 12 2019, brian m. carlson wrote:

> Gitweb has several hard-coded 40 values throughout it to check for
> values that are passed in or acquired from Git. To simplify the code,
> introduce a regex variable that matches either exactly 40 or exactly 64
> hex characters, and use this variable anywhere we would have previously
> hard-coded a 40 in a regex.
>
> Similarly, switch the code that looks for deleted diffinfo information
> to look for either 40 or 64 zeros, and update one piece of code to use
> this function. Finally, when formatting a log line, allow an
> abbreviated describe output to contain up to 64 characters.

This might be going a bit overboard but I tried this with a variant
where...

> +# A regex matching a valid object ID.
> +our $oid_regex = qr/(?:[0-9a-fA-F]{40}(?:[0-9a-fA-F]{24})?)/;
> +

Instead of this dense regex I did:

    my $sha1_len = 40;
    my $sha256_extra_len = 24;
    my $sha256_len = $sha1_len + $sha256_extra_len;

    sub oid_nlen_regex {
        my $len = shift;
        my $hchr = qr/[0-9a-fA-F]/;
        return qr/(?:(?:$hchr){$len})/;
    }

    our $oid_regex;
    {
        my $x = oid_nlen_regex($sha1_len);
        my $y = oid_nlen_regex($sha256_extra_len);
        $oid_regex = qr/(?:$x(?:$y)?)/;
    }

Then most of the rest of this is the same, e.g.:

> -	if ($input =~ m/^[0-9a-fA-F]{40}$/) {

But...

> @@ -2037,10 +2040,10 @@ sub format_log_line_html {
>  			(?<!-) # see strbuf_check_tag_ref(). Tags can't start with -
>  			[A-Za-z0-9.-]+
>  			(?!\.) # refs can't end with ".", see check_refname_format()
> -			-g[0-9a-fA-F]{7,40}
> +			-g[0-9a-fA-F]{7,64}
>  			|
>  			# Just a normal looking Git SHA1
> -			[0-9a-fA-F]{7,40}
> +			[0-9a-fA-F]{7,64}
>  		)
>  		\b
>  	}{

E.g. here we can call oid_nlen_regex("7,64") to produce this blurb.
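
For what it's worth, a standalone sketch of that call (not run against
gitweb itself; the sample strings and the simplified tag prefix are made
up for illustration). It relies on $len being interpolated verbatim
between the braces, so a range works as well as a single count:

```perl
use strict;
use warnings;

# Same helper as in the mail above. $len is interpolated verbatim into
# the {...} quantifier, so it can be a count ("40") or a range ("7,64").
sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

my $abbrev = oid_nlen_regex("7,64");

# Hypothetical inputs: a 7-char abbreviation, a 6-char one (too short),
# and a describe-style name. The prefix pattern here is a simplified
# stand-in for the one in format_log_line_html, not a copy of it.
for my $s ("deadbee", "deadbe", "v2.20.0-12-gdeadbeef") {
    my $ok = $s =~ /^(?:[A-Za-z0-9.-]+-g)?$abbrev$/ ? "match" : "no match";
    print "$s: $ok\n";
}
```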

> -		if ($line =~ m/^index [0-9a-fA-F]{40},[0-9a-fA-F]{40}/) {
> +		if ($line =~ m/^index $oid_regex,$oid_regex/) {
> -		} elsif ($line =~ m/^index [0-9a-fA-F]{40}..[0-9a-fA-F]{40}/) {
> +		} elsif ($line =~ m/^index $oid_regex..$oid_regex/) {

And here, maybe nobody cares, but we now implicitly accept mixed SHA-1 &
SHA-256 input. Whereas we could have a helper on top of the above code
like:

    sub oid_nlen_prefix_infix_regex {
        my $nlen = shift;
        my $prefix = shift;
        my $infix = shift;

        my $rx = oid_nlen_regex($nlen);

        return qr/^\Q$prefix\E$rx\Q$infix\E$rx$/;
    }

And then e.g.:

    } elsif ($line =~ oid_nlen_prefix_infix_regex($sha1_len, "index ", "..") ||
             $line =~ oid_nlen_prefix_infix_regex($sha256_len, "index ", "..")) {

So only accept SHA1..SHA1 or SHA256..SHA256, not SHA1..SHA256 or
SHA256..SHA1.
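
A small standalone script along these lines can be used to compare the
two approaches. It is only a sketch: the object IDs are made up, the
strict_match() wrapper is a hypothetical name introduced here, and the
"loose" pattern is anchored and has its dots escaped (unlike the hunk
quoted above) so that the comparison is like for like.

```perl
use strict;
use warnings;

my $sha1_len = 40;
my $sha256_len = 64;

sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

sub oid_nlen_prefix_infix_regex {
    my ($nlen, $prefix, $infix) = @_;
    my $rx = oid_nlen_regex($nlen);
    return qr/^\Q$prefix\E$rx\Q$infix\E$rx$/;
}

# The combined regex from the patch: 40 hex chars, optionally 24 more.
my $oid_regex = qr/(?:[0-9a-fA-F]{40}(?:[0-9a-fA-F]{24})?)/;

# Made-up object IDs, for illustration only.
my $sha1 = "a" x $sha1_len;
my $sha256 = "b" x $sha256_len;

# Hypothetical wrapper: both IDs must have the same length.
sub strict_match {
    my $line = shift;
    return $line =~ oid_nlen_prefix_infix_regex($sha1_len, "index ", "..")
        || $line =~ oid_nlen_prefix_infix_regex($sha256_len, "index ", "..");
}

for my $line ("index $sha1..$sha1",
              "index $sha256..$sha256",
              "index $sha1..$sha256") {
    my $loose = $line =~ m/^index $oid_regex\.\.$oid_regex$/ ? 1 : 0;
    my $strict = strict_match($line) ? 1 : 0;
    printf "loose=%d strict=%d (%d chars)\n", $loose, $strict, length $line;
}
```

The mixed SHA1..SHA256 line is accepted by the combined regex but
rejected by the per-length helper. Note that because of the trailing `$`
in the helper, a line with anything after the second ID (real "index"
lines can carry a file mode there) would not match, so the helper would
need adjusting before being dropped into gitweb as-is.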