Re: MM2 crawler crash solved

Kevin Fenzi <kevin@xxxxxxxxx> · Fri, 10 Apr 2015 10:05:27 -0600

On Fri, 10 Apr 2015 14:41:22 +0200
Adrian Reber <adrian@xxxxxxxx> wrote:

> While trying to recreate the mm2_crawler crash without the
> MirrorManager database as backend I discovered that the crawler
> mainly uses Python's httplib to do all the HEAD requests. For
> repomd.xml files, which are actually downloaded, the crawler switches
> to urlgrabber, which seems to be problematic in threaded
> applications, or in combination with httplib, or something.

Ah. Great detective work! 

> The easiest solution seems to be to rewrite the single
> urlgrabber.urlread() to use one of the other available methods.
> 
> So a question to the python experts. Which implementation is the
> "best" to download a single repomd.xml via either http or ftp?
> 
> I would replace it with urllib2. Is that the correct replacement?

I would think either that or python-requests would work. Not sure...
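
For reference, a minimal sketch of what the urllib2 route could look like.
It is written against urllib.request, the Python 3 successor to urllib2,
whose urlopen() handles both http and ftp URLs. The function name
fetch_repomd and the 30 second default timeout are made up for
illustration and are not taken from the crawler. Whether this avoids the
crash seen with urlgrabber has not been tested here.

```python
import urllib.request


def fetch_repomd(url, timeout=30):
    """Download one document (for example a repomd.xml) and return its bytes.

    Hypothetical helper, not part of mm2_crawler. The URL scheme decides
    which urllib handler is used, so http and ftp go through the same call.
    """
    # urlopen() raises urllib.error.URLError (or a subclass) on failure,
    # so the caller can treat a bad mirror like any other fetch error.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

A caller would use it as fetch_repomd(mirror_url), where mirror_url is
the full http or ftp URL of the repomd.xml on the mirror being crawled,
and wrap the call in a try/except for urllib.error.URLError.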

kevin
_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure