Re: OT: help with search

On 11/02/2015 10:48 AM, bruce wrote:
Rick!!!

Really, you're throwing this person to Java!!!  You're a cruel dude!!!

Muah-hah-hah! (twirling moustache and cackling evilly!)

But yeah, a web crawler is what you need. As stated, do a Google search;
there are plenty out there. You just need an easy-to-use crawler that
you can point at a site, and the crawler will then iterate through all
the pages in the domain, retrieving all the links for you.
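
For anyone who would rather script this than download a tool, here is a
minimal sketch of that kind of crawler in Python, using only the standard
library. It is not how WebSPHINX or any other crawler in this thread works;
the names LinkParser, fetch_page and crawl, and the max_pages limit, are
assumptions made up for illustration. It walks the pages of one host
breadth-first and collects every link it sees.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_page(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=100):
    """Breadth-first crawl limited to the host of start_url.

    Returns a sorted list of every absolute http(s) link seen on the way.
    max_pages is an arbitrary safety limit, not a recommendation.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    visited = set()
    found = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # skip pages that fail to download
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any "#fragment" part.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).scheme not in ("http", "https"):
                continue
            found.add(link)
            # Only follow links that stay on the starting host.
            if urlparse(link).netloc == host and link not in visited:
                queue.append(link)
    return sorted(found)
```

Calling crawl() with a start URL returns the list of links. The fetch
argument is there so the download step can be swapped out, for example
for one that adds a delay between requests.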

I sorta chose that one as it has a fairly simple "download and use"
thing going for it.

On Mon, Nov 2, 2015 at 1:31 PM, Rick Stevens <ricks@xxxxxxxxxxxxxx> wrote:
On 11/02/2015 10:22 AM, jd1008 wrote:



On 11/02/2015 11:15 AM, bruce wrote:

ok...


so you have a 'local' site, not a page, and you want to extract/get
all the links for the 'domain' of the site you're looking at.

you're going to have to have an app/process that crawls the site, and
generates the links.

there's a bunch of open source stuff that lets you craft a process
to do this, depending on your skillset.  (not sure what your dev
level/skillset is)

you might also have 'plugins' for the browser that will more or less
generate this kind of data.

webscraping/crawling/links  <<< terms if you need them.

let us know what else you need.


On Mon, Nov 2, 2015 at 12:53 PM, jd1008 <jd1008@xxxxxxxxx> wrote:


On 11/01/2015 08:01 PM, bruce wrote:

hey...

is your issue that you have a specific site you can point to, and you
want to get links off the site?

or is it something else?



On Sun, Nov 1, 2015 at 7:30 PM, jd1008 <jd1008@xxxxxxxxx> wrote:

I googled for a way to list all items found on a single page.
What I am searching for is very very specific (in double quotes)
and only on a specific web site:
For example:

my_Favorite_Site.com: "my specific phrase" -some_word


It comes up with a total of 12K hits on that web site.

I need a way to list the URLs of all the hits, or find a
way to easily capture the URLs of all hits without the
rigmarole of right-clicking each link and copying its URL.

Has anyone found a way to accomplish this?
--
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
Have a question? Ask away: http://ask.fedoraproject.org

Just the links, so I can put them in a text file for another program to
go through them.
--

No.
Not a local page. It can be any public search engine,
and it can be any specific phrase.
I already provided an example.
But the example does not give me just the raw text of the links of the
hits found, nor does it give all of them in one fell swoop, which you
could save to a text file.
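
One way to get "just the links" into a text file is sketched below in
Python. It assumes the HTML of a page has already been saved or fetched
by some other means; it does not query a search engine. The names
HrefCollector, links_from_html and save_links are hypothetical and
chosen only for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class HrefCollector(HTMLParser):
    """Records the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_from_html(html, base_url):
    """Return the absolute links in html, de-duplicated, in first-seen order."""
    collector = HrefCollector()
    collector.feed(html)
    seen = {}  # dict keys keep insertion order and drop duplicates
    for href in collector.hrefs:
        seen.setdefault(urljoin(base_url, href), None)
    return list(seen)


def save_links(links, path):
    """Write one URL per line so another program can read the file."""
    with open(path, "w", encoding="utf-8") as out:
        for link in links:
            out.write(link + "\n")
```

The base_url argument is the address the page came from, so that relative
links such as "/about" come out as full URLs in the file.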


Try this:

         https://www.cs.cmu.edu/~rcm/websphinx/

Perhaps that'll do what you want.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ricks@xxxxxxxxxxxxxx -
- AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
-                                                                    -
-         The world is coming to an end ... SAVE YOUR FILES!!!       -
----------------------------------------------------------------------


--
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ricks@xxxxxxxxxxxxxx -
- AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
-                                                                    -
-   "Do you suffer from long-term memory loss?"  "I don't remember"  -
-                            -- Chumbawamba, "Amnesia" (TubThumping) -
----------------------------------------------------------------------


