On 25 November 2010 11:32, Deva <devendra.in@xxxxxxxxx> wrote:
> Use curl
> http://php.net/manual/en/book.curl.php
>
>
> On Thu, Nov 25, 2010 at 4:41 PM, Shreyas Agasthya <shreyasbr@xxxxxxxxx> wrote:
>
>> I feel you should use more of the 4th method here as you are not trying to
>> read the file but the header level (7th layer) information of the HTTP
>> protocol.
>>
>> http://php.net/manual/en/function.file-get-contents.php
>>
>>
>> --Shreyas
>>
>> On Thu, Nov 25, 2010 at 4:11 PM, Ron Piggott <ron.piggott@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> > Will the header pass with using file_get_contents , or should I be using
>> > another command, and if so, which one? Ron
>> >
>> > <?php
>> >
>> >     header('User Agent: RonBot (http://www.example.com)');
>> >     $url = "http://www.example.com";
>> >
>> >     $input = file_get_contents($url);
>> >
>> >
>> >
>> > The Verse of the Day
>> > “Encouragement from God’s Word”
>> > http://www.TheVerseOfTheDay.info
>> >
>> > *From:* Shreyas Agasthya <shreyasbr@xxxxxxxxx>
>> > *Sent:* Thursday, November 25, 2010 4:21 AM
>> > *To:* Ron Piggott <ron.piggott@xxxxxxxxxxxxxxxxxx>
>> > *Cc:* php-general@xxxxxxxxxxxxx ; ash@xxxxxxxxxxxxxxxxxxxx
>> > *Subject:* Re: Fw: Spoofing user_agent
>> >
>> > A standard HTTP Request headers is : User Agent (without the underscore).
>> >
>> > --Shreyas
>> >
>> > On Thu, Nov 25, 2010 at 2:36 PM, Ron Piggott <ron.piggott@xxxxxxxxxxxxxxxxxx> wrote:
>> >
>> >>
>> >> Is this what you are telling me to do:
>> >>
>> >> header('user_agent: RonBot (http://www.theverseoftheday.info)');
>> >>
>> >> Ron
>> >>
>> >> The Verse of the Day
>> >> “Encouragement from God’s Word”
>> >> http://www.TheVerseOfTheDay.info
>> >>
>> >> From: ash@xxxxxxxxxxxxxxxxxxxx
>> >> Sent: Thursday, November 25, 2010 3:34 AM
>> >> To: Ron Piggott ; php-general@xxxxxxxxxxxxx
>> >> Subject: Re: Fw: Spoofing user_agent
>> >>
>> >> You need to set it in the header request you make.
>> >> Putting it in the script you're using as a spider with ini_set won't do
>> >> anything because the Target site doesn't know anything about it.
>> >>
>> >> Thanks,
>> >> Ash
>> >> http://www.ashleysheridan.co.uk
>> >>
>> >> ----- Reply message -----
>> >> From: "Ron Piggott" <ron.piggott@xxxxxxxxxxxxxxxxxx>
>> >> Date: Thu, Nov 25, 2010 08:25
>> >> Subject: Fw: Spoofing user_agent
>> >> To: <php-general@xxxxxxxxxxxxx>
>> >>
>> >> I have wrote a script to generate a sitemap of my web site. It crawls all
>> >> of the site web pages. (About 30,000)
>> >>
>> >> I need help to spoof the user_agent variable so the stats program running
>> >> in the background ( “AWSTATS” ) will treat the crawl as a bot, not browsing
>> >> usage.
>> >>
>> >> The sitemap generator is a cron job. I tried the syntax:
>> >> ini_set('user_agent', 'RonBot (http://www.theverseoftheday.info)');
>> >>
>> >> This didn’t work. The browsing was attributed to the dedicated IP
>> >> address.
>> >>
>> >> How do I get AWSTATS to access this, such as other entries under the
>> >> “Robots/Spiders visitors” heading:
>> >> Unknown robot (identified by 'bot*')
>> >>
>> >> I don’t mean any ill will by changing this setting. Thanks for the help.
>> >>
>> >> Ron
>> >>
>> >> The Verse of the Day
>> >> “Encouragement from God’s Word”
>> >> http://www.TheVerseOfTheDay.info
>> >>
>> >>
>> >
>> >
>> > --
>> > Regards,
>> > Shreyas Agasthya
>> >
>>
>>
>>
>> --
>> Regards,
>> Shreyas Agasthya
>>
>
>
>
> --
> :DJ
>

Calling header() is no use here. It sets a response header that your
script sends to its own client; it has no effect on the request that
file_get_contents() sends to the remote server.

I use stream contexts.

$s_Contents = file_get_contents(
    $s_URL,
    false,
    stream_context_create(
        array(
            'http' => array(
                'method' => 'GET',
                'header' => "User-Agent: RonBot (http://www.example.com)\r\n"
            ),
        )
    )
);

You can supply cookies, or any other header, with the request. Make
sure you end each header with \r\n and just concatenate them into one
string.
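
To illustrate the point about concatenating headers, here is a minimal
sketch. The helper name build_header_string(), the cookie value and the
Accept header are placeholders made up for this example, not part of
the original script. The request itself is left commented out so the
snippet runs without network access.

```php
<?php
// Hypothetical helper: join name => value pairs into the single string
// expected by the 'header' option of the http stream wrapper.
function build_header_string(array $headers)
{
    $lines = '';
    foreach ($headers as $name => $value) {
        // Each header line is terminated with \r\n.
        $lines .= $name . ': ' . $value . "\r\n";
    }
    return $lines;
}

$header = build_header_string(array(
    'User-Agent' => 'RonBot (http://www.example.com)',
    'Cookie'     => 'session=abc123', // placeholder cookie
    'Accept'     => 'text/html',      // placeholder extra header
));

$context = stream_context_create(array(
    'http' => array(
        'method' => 'GET',
        'header' => $header,
    ),
));

// Uncomment to perform the actual request:
// $s_Contents = file_get_contents('http://www.example.com', false, $context);
```

stream_context_get_options($context) can be used to check what will be
sent before making any request.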

If you are doing this in a loop, then I'd recommend creating a default
stream context and then the request would just be ...

$s_Contents = file_get_contents($s_URL);

As the default stream context would be applied.

I had to use a default stream context to route all http requests
through an NTLM authentication proxy server because PHP doesn't deal
with NTLM authentication.

See my user notes on
http://docs.php.net/manual/en/function.stream-context-get-default.php.
Don't bother with the link at the bottom of the user note; it's not
live.

Richard.

--
Richard Quadling
Twitter : EE : Zend
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
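
A rough sketch of the default stream context approach described in the
message above, assuming PHP 5.3 or later, where
stream_context_set_default() is available. The crawl() function and its
URL list are hypothetical names used only for illustration. crawl() is
defined but not called, so the snippet makes no network requests.

```php
<?php
// Set the default context once, before the crawl loop. Later http://
// requests made without an explicit context use these options.
stream_context_set_default(array(
    'http' => array(
        'method' => 'GET',
        'header' => "User-Agent: RonBot (http://www.example.com)\r\n",
    ),
));

// Read the options back to confirm what the default context holds.
$defaults = stream_context_get_options(stream_context_get_default());

// Hypothetical crawl loop: no context argument is passed, so the
// default context set above is applied to every request.
function crawl(array $urls)
{
    $pages = array();
    foreach ($urls as $s_URL) {
        $pages[$s_URL] = file_get_contents($s_URL);
    }
    return $pages;
}
```

Calling crawl(array('http://www.example.com/')) would then send the
RonBot User-Agent without any per-request setup.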