Re: OT: help with search

Rick!!!

Really, you're throwing this person to Java!!!  You're a cruel dude!!!

But yeah, a web crawler is what you need. As stated, do a Google search;
there are plenty out there. You just need an easy-to-use crawler that
you can point at a site, and the crawler will then iterate through all
the pages in the domain, retrieving all the links for you.
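
If you'd rather roll your own than hunt for a tool, here is a rough sketch
of that idea in Python, standard library only. It is not any particular
crawler's code: the names LinkParser, fetch_html and crawl, and the
max_pages cap, are all made up for illustration. It starts at one URL,
follows links that stay on the same host, and collects every http/https
link it sees. It does not check robots.txt or throttle itself, so treat
it as a starting point only.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download one page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=100):
    """Breadth-first crawl of one host; return every link seen, sorted."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    visited = set()
    found = set()
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = fetch(url)
        except OSError:
            continue  # unreachable page: skip it and keep going
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment part.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).scheme not in ("http", "https"):
                continue  # ignore mailto:, javascript:, etc.
            found.add(link)
            if urlparse(link).netloc == host:
                queue.append(link)  # only follow links on the same host
    return sorted(found)
```

Printing "\n".join(crawl("http://example.com/")) and redirecting the
output to a file would give a plain text list of URLs, one per line, that
another program can read.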



On Mon, Nov 2, 2015 at 1:31 PM, Rick Stevens <ricks@xxxxxxxxxxxxxx> wrote:
> On 11/02/2015 10:22 AM, jd1008 wrote:
>>
>>
>>
>> On 11/02/2015 11:15 AM, bruce wrote:
>>>
>>> ok...
>>>
>>>
>>> so you have a 'local' site, not a page, and you want to extract/get
>>> all the links for the 'domain' of the site you're looking at.
>>>
>>> you're going to have to have an app/process that crawls the site, and
>>> generates the links.
>>>
>>> there are a bunch of open source tools that let you craft a process
>>> to do this, depending on your skillset.  (not sure what your dev
>>> level/skillset is)
>>>
>>> you might also have 'plugins' for the browser that will more or less
>>> generate this kind of data.
>>>
>>> webscraping/crawling/links  <<< terms if you need them.
>>>
>>> let us know what else you need.
>>>
>>>
>>> On Mon, Nov 2, 2015 at 12:53 PM, jd1008 <jd1008@xxxxxxxxx> wrote:
>>>>
>>>>
>>>> On 11/01/2015 08:01 PM, bruce wrote:
>>>>>
>>>>> hey...
>>>>>
>>>>> is your issue, you have a specific site you can point to, and you want
>>>>> to get links off the site?
>>>>>
>>>>> or is it something else?
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Nov 1, 2015 at 7:30 PM, jd1008 <jd1008@xxxxxxxxx> wrote:
>>>>>>
>>>>>> I googled for a way to list all items found on a single page.
>>>>>> What I am searching for is very very specific (in double quotes)
>>>>>> and only on a specific web site:
>>>>>> For example:
>>>>>>
>>>>>> my_Favorite_Site.com: "my specific phrase" -some_word
>>>>>>
>>>>>>
>>>>>> It comes up with a total of 12K hits on that web site.
>>>>>>
>>>>>> I need a way to list the URLs of all the hits, or find a
>>>>>> way to easily capture the URLs of all hits without the
>>>>>> rigmarole of right-clicking on each link and copying the URL.
>>>>>>
>>>>>> Has anyone found a way to accomplish this?
>>>>>> --
>>>>>> users mailing list
>>>>>> users@xxxxxxxxxxxxxxxxxxxxxxx
>>>>>> To unsubscribe or change subscription options:
>>>>>> https://admin.fedoraproject.org/mailman/listinfo/users
>>>>>> Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
>>>>>> Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
>>>>>> Have a question? Ask away: http://ask.fedoraproject.org
>>>>
>>>> Just the links, so I can put them in a text file for another program to
>>>> go through them.
>>>> --
>>>>
>> No.
>> Not a local page. It can be any public search engine,
>> and it can be any specific phrase.
>> I already provided an example.
>> But the example does not give me just the raw text of the links of the
>> hits found, nor does it give all of them in one fell swoop so that you
>> could save them to a text file.
>
>
> Try this:
>
>         https://www.cs.cmu.edu/~rcm/websphinx/
>
> Perhaps that'll do what you want.
> ----------------------------------------------------------------------
> - Rick Stevens, Systems Engineer, AllDigital    ricks@xxxxxxxxxxxxxx -
> - AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
> -                                                                    -
> -         The world is coming to an end ... SAVE YOUR FILES!!!       -
> ----------------------------------------------------------------------
>


