Re: [RFC] Implementing gitweb output caching - issues to solve


Jakub Narebski wrote:

> In my rewrite
>
>   [PATCHv6 17/24] gitweb: Show appropriate "Generating..." page when regenerating cache
>   http://thread.gmane.org/gmane.comp.version-control.git/163052/focus=163040
>   http://repo.or.cz/w/git/jnareb-git.git/commitdiff/48679f7985ccda16dc54fda97790841bab4a0ba2#patch1
>
> (see the browser_is_robot() subroutine:
>
>   http://repo.or.cz/w/git/jnareb-git.git/blob/48679f7985ccda16dc54fda97790841bab4a0ba2:/gitweb/gitweb.perl#l870
>
> I use the HTTP::BrowserDetect package if available, and its ->robot() method.
>
> The fallback is to use *whitelist*, assuming that it would be better to
> not show "Generating..." page rather than download the wrong thing.
> I also guess that most (all?) web browsers use "Mozilla compatible"
> somewhere in their User-Agent string, thus matching 'Mozilla'.

Interesting.  http://www.user-agents.org/ seems to suggest that many
robots do use Mozilla (though I don't think it's worth bending over
backwards to help them see the page correctly).

HTTP::BrowserDetect uses a blacklist as far as I can tell.  Maybe in
the long term it would be nice to add a whitelist ->human() method.
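For illustration, here is a rough sketch of such a whitelist-first check. This is not gitweb's actual Perl code or HTTP::BrowserDetect's API; it's a Python sketch with made-up marker lists, showing the idea that an unrecognized User-Agent defaults to "robot" (so it never gets the "Generating..." page), and that known robot markers are checked first because many robots also claim "Mozilla (compatible; ...)":

```python
import re

# Illustrative marker lists (assumptions, not gitweb's actual patterns).
BROWSER_MARKERS = re.compile(r'Mozilla|Opera|Lynx|Links|w3m', re.IGNORECASE)
ROBOT_MARKERS = re.compile(r'bot|crawler|spider|slurp|archiver', re.IGNORECASE)

def browser_is_human(user_agent):
    """Whitelist-first check: only known browsers count as human."""
    if not user_agent:
        return False   # no User-Agent header: assume robot (safe default)
    if ROBOT_MARKERS.search(user_agent):
        # Catches robots that also claim Mozilla, e.g.
        # "Mozilla/5.0 (compatible; Googlebot/2.1; ...)"
        return False
    # Unrecognized agents fall through to False: better to skip the
    # "Generating..." page than serve it to a crawler.
    return bool(BROWSER_MARKERS.search(user_agent))
```

With this ordering, `browser_is_human("Wget/1.12")` is false (no browser marker) and a Googlebot UA is false even though it contains "Mozilla", which is the failure mode discussed above.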

Cc-ing Olaf Alders for ideas.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html