On Thu, Sep 2, 2010 at 5:34 PM, Mike McGrath <mmcgrath@xxxxxxxxxx> wrote:
> On Thu, 2 Sep 2010, Pascal Minnerup wrote:
>
>> Dear Fedora team,
>>
>> We on the Google Code Search project (www.google.com/codesearch) want
>> to improve the quality of our index, and as part of that, would like
>> to systematically crawl the fedora git repositories of
>> fedoraproject.org, which we consider one of the major hosts of open
>> source. Our crawlers use bandwidth throttling that should ensure that
>> we don't overstress your web servers.
>>
>> 1. Is it okay for you if we systematically crawl your git
>>    repositories for new source code?
>>
>> 2. How would you recommend we get the repository directories? Our
>>    current approach would be to get the git repositories of recently
>>    updated packages from this page:
>>    http://pkgs.fedoraproject.org/gitweb/?o=age.
>>
>> 3. Are there any particular times or actions we should _avoid_?
>>
>> 4. Is there any particular person we should talk to in the future?
>>
>> An answer to these questions would be very helpful in improving the
>> presence of Fedora code files in Code Search. We look forward to
>> hearing from you.
>>
>
> Thanks for contacting us, we really don't know how that would all react
> but I'm ok with it provided we can contact you to change things later if
> things do go south?
>
>        -Mike

Of course! We'll try to give you a heads-up the first time we crawl it,
so if you do notice anything strange, you'll know who to blame!

Thanks,
Ben

Ben St. John
jbstjohn@xxxxxxxxxx
Tel: +49 (0) 89 83 930-9054
Fax: +49 (0) 89 83 930-9001

Google Germany GmbH
Dienerstr. 12
80331 München

AG Hamburg, HRB 86891 | Sitz der Gesellschaft: Hamburg
Geschäftsführer: Nikesh Arora, John Herlihy, Graham Law, Lloyd Martin,
Kent Walker

--
websites mailing list
websites@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/websites