Use curl

http://php.net/manual/en/book.curl.php

On Thu, Nov 25, 2010 at 4:41 PM, Shreyas Agasthya <shreyasbr@xxxxxxxxx> wrote:

> I feel you should use more of the 4th method here as you are not trying to
> read the file but the header level (7th layer) information of the HTTP
> protocol.
>
> http://php.net/manual/en/function.file-get-contents.php
>
> --Shreyas
>
> On Thu, Nov 25, 2010 at 4:11 PM, Ron Piggott
> <ron.piggott@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Will the header pass with using file_get_contents, or should I be using
> > another command, and if so, which one? Ron
> >
> > <?php
> >
> > header('User Agent: RonBot (http://www.example.com)');
> > $url = "http://www.example.com";
> >
> > $input = file_get_contents($url);
> >
> > The Verse of the Day
> > “Encouragement from God’s Word”
> > http://www.TheVerseOfTheDay.info
> >
> > *From:* Shreyas Agasthya <shreyasbr@xxxxxxxxx>
> > *Sent:* Thursday, November 25, 2010 4:21 AM
> > *To:* Ron Piggott <ron.piggott@xxxxxxxxxxxxxxxxxx>
> > *Cc:* php-general@xxxxxxxxxxxxx ; ash@xxxxxxxxxxxxxxxxxxxx
> > *Subject:* Re: Fw: Spoofing user_agent
> >
> > A standard HTTP request header is: User Agent (without the underscore).
> >
> > --Shreyas
> >
> > On Thu, Nov 25, 2010 at 2:36 PM, Ron Piggott
> > <ron.piggott@xxxxxxxxxxxxxxxxxx> wrote:
> >
> >> Is this what you are telling me to do:
> >>
> >> header('user_agent: RonBot (http://www.theverseoftheday.info)');
> >>
> >> Ron
> >>
> >> From: ash@xxxxxxxxxxxxxxxxxxxx
> >> Sent: Thursday, November 25, 2010 3:34 AM
> >> To: Ron Piggott ; php-general@xxxxxxxxxxxxx
> >> Subject: Re: Fw: Spoofing user_agent
> >>
> >> You need to set it in the header request you make. Putting it in the
> >> script you're using as a spider with ini_set won't do anything because the
> >> Target site doesn't know anything about it.
> >>
> >> Thanks,
> >> Ash
> >> http://www.ashleysheridan.co.uk
> >>
> >> ----- Reply message -----
> >> From: "Ron Piggott" <ron.piggott@xxxxxxxxxxxxxxxxxx>
> >> Date: Thu, Nov 25, 2010 08:25
> >> Subject: Fw: Spoofing user_agent
> >> To: <php-general@xxxxxxxxxxxxx>
> >>
> >> I have written a script to generate a sitemap of my web site. It crawls all
> >> of the site's web pages. (About 30,000)
> >>
> >> I need help to spoof the user_agent variable so the stats program running
> >> in the background ( “AWSTATS” ) will treat the crawl as a bot, not browsing
> >> usage.
> >>
> >> The sitemap generator is a cron job. I tried the syntax:
> >> ini_set('user_agent', 'RonBot (http://www.theverseoftheday.info)/'/);
> >>
> >> This didn’t work. The browsing was attributed to the dedicated IP
> >> address.
> >>
> >> How do I get AWSTATS to access this, such as other entries under the
> >> “Robots/Spiders visitors” heading:
> >> Unknown robot (identified by 'bot*')
> >>
> >> I don’t mean any ill will by changing this setting. Thanks for the help.
> >>
> >> Ron
> >
> > --
> > Regards,
> > Shreyas Agasthya

--
:DJ
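
For reference, here is a rough sketch of the two approaches discussed in this thread: sending the User-Agent with file_get_contents through a stream context, and sending it with curl. Note that header() sets response headers on the script's own output, so it has no effect on requests the script makes. The function names (build_bot_context, fetch_with_streams, fetch_with_curl) and the BOT_USER_AGENT constant are made up for illustration, the URL is the example one from the thread, and the curl variant assumes the curl extension is installed. Treat it as a starting point, not a tested crawler.

```php
<?php
// Hypothetical bot name, reusing the example URL from the thread.
const BOT_USER_AGENT = 'RonBot (http://www.example.com)';

/**
 * Build a stream context whose HTTP requests carry the given User-Agent.
 * The 'user_agent' key is a documented option of the http:// wrapper.
 */
function build_bot_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds; an arbitrary choice for this sketch
        ],
    ]);
}

/**
 * Fetch a URL with file_get_contents, passing the context as the third
 * argument. Returns the body as a string, or false on failure.
 */
function fetch_with_streams(string $url, string $userAgent)
{
    return file_get_contents($url, false, build_bot_context($userAgent));
}

/**
 * Fetch a URL with curl (requires the curl extension).
 * Returns the body as a string, or false on failure.
 */
function fetch_with_curl(string $url, string $userAgent)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);  // sets the User-Agent request header
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
    // On older PHP versions you may also want curl_close($ch) after the request.
    return curl_exec($ch);
}
```

Either function could then be called from the sitemap cron job, for example $html = fetch_with_streams('http://www.example.com', BOT_USER_AGENT);. On the wire the header name is User-Agent, with a hyphen. Whether AWSTATS then files the visits under “Robots/Spiders visitors” presumably depends on its robot patterns, such as the 'bot*' rule quoted above, so that part is worth checking against the stats after a crawl.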