>> +	my($user_agent) = $ENV{'HTTP_USER_AGENT'};
>
> What if $ENV{'HTTP_USER_AGENT'} is unset / undef, e.g. because we are
> running gitweb as a script... which includes running gitweb tests?

It can be disabled for the running of tests, but the default is to show
'Generating...' vs. not.  I'd rather assume there's an intelligent client
on the other end and give users a reason why they aren't staring at their
initial content immediately (and thus thinking something is broken).

>> +
>> +	if(
>> +		# wget case
>> +		$user_agent =~ /^Wget/i
>> +		||
>> +		# curl should be excluded I think, probably better safe than sorry
>> +		$user_agent =~ /^curl/i
>> +	){
>> +		return 1; # True
>> +	}
>> +
>> +	return 0;
>> +}
>
> Compare (note: handcrafted solution is to whitelist, not blacklist):
>
> +sub browser_is_robot {
> +	return 1 if !exists $ENV{'HTTP_USER_AGENT'}; # gitweb run as script
> +	if (eval { require HTTP::BrowserDetect; }) {
> +		my $browser = HTTP::BrowserDetect->new();
> +		return $browser->robot();
> +	}
> +	# fallback on detecting known web browsers
> +	return 0 if ($ENV{'HTTP_USER_AGENT'} =~ /\b(?:Mozilla|Opera|Safari|IE)\b/);
> +	# be conservative; if not sure, assume non-interactive
> +	return 1;
> +}

My initial look indicated that perl-http-browserdetect wasn't available
for RHEL / CentOS 5 - it is, however, available in EPEL.

There are a couple of things to note about User-Agent strings in general,
though:

- They lie... a lot
- Robots lie even more

Blacklisting is still the better option, by a lot.

I'll re-work this some in v9, as I'm fine with the added dependency.

- John 'Warthog9' Hawley
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html