Re: BUG in git diff-index

Eric Wong <e@xxxxxxxxx> · Tue, 26 Sep 2017 20:11:12 +0000

Marc Herbert <Marc.Herbert@xxxxxxxxx> wrote:
> PS: I used NNTP and http://dir.gmane.org/gmane.comp.version-control.git
> to quickly find this old thread (what could we do without NNTP?). Then
> I googled for a web archive of this thread and Google could only find
> this one: http://git.661346.n2.nabble.com/BUG-in-git-diff-index-tt7652105.html#none
> Is there a robots.txt to block indexing on
> https://public-inbox.org/git/1459432667.2124.2.camel@xxxxxxx ?

There are no blocks on public-inbox.org, and I'm completely against
any sort of blocking/throttling.  Maybe there are too many pages
to index?  Or the Message-IDs in URLs are too ugly/scary?  Not
sure what to do about that...

Anyway, I just put up a robots.txt with Crawl-Delay: 1, since I
seem to recall crawlers use a more conservative delay by default:

==> https://public-inbox.org/robots.txt <==
User-Agent: *
Crawl-Delay: 1
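
For anyone curious how a well-behaved crawler might read that file,
here is a minimal sketch using Python's standard urllib.robotparser.
The agent name "ExampleBot" and the read_policy() helper are made up
for illustration, and real crawlers may well ignore Crawl-Delay or
interpret it differently:

```python
from urllib.robotparser import RobotFileParser

# Same two directives as the robots.txt shown above.
ROBOTS_TXT = """\
User-Agent: *
Crawl-Delay: 1
"""


def read_policy(robots_txt, agent, url):
    """Return (allowed, crawl_delay) for a hypothetical crawler `agent`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url), parser.crawl_delay(agent)


# "ExampleBot" is an assumed name; it falls under the "*" entry.
allowed, delay = read_policy(
    ROBOTS_TXT, "ExampleBot", "https://public-inbox.org/git/"
)
print(allowed, delay)
```

Since there are no Disallow lines, every URL stays fetchable and the
only thing the file communicates is the requested delay.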

I don't know much about SEO other than keeping a site up and
responsive, so perhaps there's more to be done about getting
things indexed...