Re: waay off topic... help if you wish, toss if you don't.. if more than a few object.. would the moderator kill the thread please!

Is "crawling" the same as "parsing"? I have limited knowledge of databases in general; most of my exposure comes from SQL.

On Fri, Oct 26, 2018, 5:18 AM Frau Silvia Sánchez <lailahfsf@xxxxxxxxx> wrote:

Hi Bruce,

Sounds like an interesting problem to solve. And it sounds like something that might be useful for me too at some point. Unfortunately, my knowledge of databases is close to nil. But I'm always ready to learn and help, so I'd love to know more if that's okay with you.
As for knowing someone who is good at it, I'm sorry, but I don't.

Kind regards,
Silvia



On Thu, 25 Oct 2018 at 22:00, bruce <badouglas@xxxxxxxxx> wrote:
Hi.

Got an issue. This is waaaay off topic, and I apologize. If more than
a few object, would the moderator please kill the thread. I wouldn't
have posted, but the list has been kind of "slow" lately, and.. well..
I have no tech/cool people to turn to!

I'm working on a crawling project. The overall project is geared
towards crawling a number of college sites (~400) to get class data,
as well as the required book data. The process targets the colleges,
does the fetch/parse, and stores the data into a mysql db. A similar
process occurs for the book data.

My issue. Please don't laugh. Make sure you're not drinking your
bourbon.. the crawl for the book data takes ~2-3 days, running a
bunch of processes on a number of cheap DigitalOcean servers. The
process generates ~720K "sections" across the colleges (for the book
section/ISBN data).
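
For a sense of scale, here is a quick back-of-the-envelope calculation of the overall throughput those numbers imply. It assumes the ~720K sections are spread evenly over the 2-3 day run; the constant and function names are illustrative only.

```python
# Rough throughput implied by ~720K sections processed in ~2-3 days.
# Assumption: work is spread evenly over the whole run.
SECTIONS = 720_000
SECONDS_PER_DAY = 86_400


def sections_per_second(days):
    """Average number of sections processed per second over `days` days."""
    return SECTIONS / (days * SECONDS_PER_DAY)


for days in (2, 3):
    print(f"{days} days: {sections_per_second(days):.2f} sections/sec")
```

That works out to roughly 3-4 sections per second in total across all servers and processes.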

I know there are people/resources who are "good" with this. I just
don't "know" any of them that I can talk with!

If you guys know of anyone, or have any thoughts/ideas/etc.. I'd
appreciate the opportunity to discuss/chat/talk/etc..

And yeah, this process is really crude, but it more or less works.. If
I had clones of me, I'd implement queues, and test out other things to
speed up the overall processing time.
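
The queue idea mentioned above could look roughly like the following sketch. It uses only Python's standard `queue` and `threading` modules. The function name `crawl_with_queue`, the `fetch_and_parse` callable, and the worker count are hypothetical placeholders, not part of the existing crawler, and the real fetch/parse/store logic would have to be plugged in.

```python
import queue
import threading


def crawl_with_queue(urls, fetch_and_parse, num_workers=8):
    """Run fetch_and_parse(url) for every URL using a pool of worker threads.

    fetch_and_parse is a caller-supplied (hypothetical) function that fetches
    one page and returns the parsed data. Returns (results, errors), two dicts
    keyed by URL.
    """
    tasks = queue.Queue()
    results = {}
    errors = {}
    lock = threading.Lock()  # protects results and errors

    def worker():
        while True:
            url = tasks.get()
            if url is None:  # sentinel: no more work for this worker
                tasks.task_done()
                return
            try:
                parsed = fetch_and_parse(url)
                with lock:
                    results[url] = parsed
            except Exception as exc:  # record the failure and keep going
                with lock:
                    errors[url] = exc
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    return results, errors
```

Because fetching is mostly waiting on the network, threads are a reasonable first thing to try; failed URLs end up in `errors` so they can be retried later without rerunning the whole crawl.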

thanks for reading..

and again.. to the moderator.. if a few people object to this, feel
free to kill the thread!

thanks guys!
_______________________________________________
users mailing list -- users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to users-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@xxxxxxxxxxxxxxxxxxxxxxx