Way off topic: help if you wish, toss if you don't. If more than a few object, would the moderator kill the thread please?

Hi.

I've got an issue, and it's way off topic, for which I apologize. If
more than a few people object, would the moderator please kill the
thread? I wouldn't have posted, but the list has been kind of "slow"
lately, and, well, I have no tech/cool people to turn to!

I'm working on a crawling project. The overall project is geared
towards crawling a number of college sites (~400) to get class data,
as well as the required book data. The process targets the colleges,
does the fetch/parse, and stores the data in a MySQL database. A
similar process occurs for the book data.

My issue (please don't laugh, and make sure you're not drinking your
bourbon): the crawl for the book data takes ~2-3 days, running a
bunch of processes on a number of cheap DigitalOcean servers. The
process generates ~720K "sections" across the colleges (for the book
section/ISBN data).
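
As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes the crawl runs non-stop for the whole 2-3 days and that all ~720K sections are processed in that window, which may not be exactly the case.

```python
# Back-of-the-envelope throughput for the book-data crawl.
# Assumption: the crawl runs continuously for the full duration.
SECTIONS = 720_000               # ~720K sections, from the figures above
SECONDS_PER_DAY = 24 * 60 * 60


def sections_per_second(days: float) -> float:
    """Average sections processed per second over a crawl of `days` days."""
    return SECTIONS / (days * SECONDS_PER_DAY)


for days in (2, 3):
    print(f"{days} days: {sections_per_second(days):.1f} sections/sec")
# prints:
# 2 days: 4.2 sections/sec
# 3 days: 2.8 sections/sec
```

So the whole fleet averages only about 3-4 sections per second combined, under these assumptions.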

I know there are people/resources who are "good" with this. I just
don't "know" any of them that I can talk with!

If you guys know of anyone, or have any thoughts/ideas/etc., I'd
appreciate the opportunity to discuss/chat/talk/etc.

And yeah, this process is really crude, but it more or less works. If
I had clones of me, I'd implement queues and test out other things to
speed up the overall processing time.
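
For what it's worth, the queue idea could look roughly like the sketch below. It uses only Python's standard library (`queue`, `threading`). The names `fetch_and_parse` and `crawl`, the worker count, and the choice of Python itself are all assumptions for illustration, not what the project actually runs.

```python
import queue
import threading


def fetch_and_parse(section_id):
    """Hypothetical stand-in for the real fetch/parse of one section."""
    return {"section": section_id, "isbns": []}


def worker(tasks, results):
    """Pull section ids off the task queue until a None sentinel arrives."""
    while True:
        section_id = tasks.get()
        if section_id is None:
            break
        try:
            results.put(fetch_and_parse(section_id))
        except Exception as exc:
            # Record the failure instead of letting one bad page kill the worker.
            results.put({"section": section_id, "error": str(exc)})


def crawl(section_ids, num_workers=8):
    """Fan the section ids out to worker threads and collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for section_id in section_ids:
        tasks.put(section_id)
    for _ in threads:
        tasks.put(None)      # one sentinel per worker so each one exits
    for t in threads:
        t.join()
    rows = []
    while not results.empty():
        rows.append(results.get())
    return rows


if __name__ == "__main__":
    print(len(crawl(range(20), num_workers=4)), "sections processed")
```

Threads suit this kind of work because the time goes into waiting on the network rather than the CPU. In a real run the collected rows would be written to the database in batches rather than held in memory.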

Thanks for reading.

And again, to the moderator: if a few people object to this, feel
free to kill the thread!

Thanks, guys!
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx


