On Mon, 2015-02-16 at 00:00 +0300, hadi wrote:
>
> > -----Original Message-----
> > From: Ashley Sheridan [mailto:ash@xxxxxxxxxxxxxxxxxxxx]
> > Sent: Sunday, February 15, 2015 11:22 PM
> > To: hadi
> > Cc: php-general@xxxxxxxxxxxxx
> > Subject: Re: duplicate image
> >
> > On Sun, 2015-02-15 at 23:08 +0300, hadi wrote:
> > >
> > > > -----Original Message-----
> > > > From: Ashley Sheridan [mailto:ash@xxxxxxxxxxxxxxxxxxxx]
> > > > Sent: Sunday, February 15, 2015 4:07 PM
> > > > To: hadi; php-general@xxxxxxxxxxxxx
> > > > Subject: RE: duplicate image
> > > >
> > > > On 15 February 2015 12:51:35 GMT+00:00, hadi <almarzuki2011@xxxxxxxxxxx> wrote:
> > > > >
> > > > >> -----Original Message-----
> > > > >> From: Ashley Sheridan [mailto:ash@xxxxxxxxxxxxxxxxxxxx]
> > > > >> Sent: Sunday, February 15, 2015 1:43 PM
> > > > >> To: hadi; php-general@xxxxxxxxxxxxx
> > > > >> Subject: Re: duplicate image
> > > > >>
> > > > >> On 15 February 2015 10:24:23 GMT+00:00, hadi <almarzuki2011@xxxxxxxxxxx> wrote:
> > > > >> >Hi,
> > > > >> >
> > > > >> >I have a script which downloads an RSS feed from the internet, but
> > > > >> >unfortunately it keeps downloading duplicate images to the database.
> > > > >> >
> > > > >> >Here is my script
> > > > >> >
> > > > >> ><?php
> > > > >> >require 'database.php';
> > > > >> >
> > > > >> >$url = "http://www.albaldnews.com/rss.php?cat=24";
> > > > >> >$rss = simplexml_load_file($url);
> > > > >> >
> > > > >> >if($rss)
> > > > >> >{
> > > > >> >echo '<h1>'.$rss->channel->title.'</h1>';
> > > > >> >echo '<li>'.$rss->channel->pubDate.'</li>';
> > > > >> >$items = $rss->channel->item;
> > > > >> >foreach($items as $item)
> > > > >> >{
> > > > >> >
> > > > >> >$title = $item->title;
> > > > >> >$link = $item->link;
> > > > >> >$published_on = $item->pubDate;
> > > > >> >$description = $item->description;
> > > > >> >$category = $item->category;
> > > > >> >$guid = $item->guid;
> > > > >> >$enclosure = $item->enclosure[0]['url'];
> > > > >> >
> > > > >> >$ch = curl_init ("$item->enclosure");
> > > > >> >curl_setopt($ch, CURLOPT_HEADER, 0);
> > > > >> >curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> > > > >> >curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
> > > > >> >$rawdata=curl_exec ($ch);
> > > > >> >curl_close ($ch);
> > > > >> >
> > > > >> >mysqli_real_escape_string($conn,$item->title);
> > > > >> >mysqli_real_escape_string($conn,$item->link);
> > > > >> >mysqli_real_escape_string($conn,$item->pubDate);
> > > > >> >mysqli_real_escape_string($conn,$item->description);
> > > > >> >mysqli_real_escape_string($conn,$item->category);
> > > > >> >$img=mysqli_real_escape_string($conn,$rawdata);
> > > > >> >
> > > > >> >$sql = "INSERT INTO feedtable
> > > > >> >(title,link,pubdate,description,category,image)VALUES
> > > > >> >('$item->title','$item->link','$item->pubDate','$item->description','$item->category','$img')";
> > > > >> >$result = mysqli_query($conn, $sql);
> > > > >> >
> > > > >> >if ($conn->query($sql) === TRUE) {
> > > > >> > echo "New record created successfully\n"; } else {
> > > > >> > echo "Error:\n " . $sql . "<br>" .
> > > > >> > $conn->error; }
> > > > >> >
> > > > >> >}
> > > > >> >}
> > > > >> >
> > > > >> >?>
> > > > >>
> > > > >> You need to either set a unique index on that field in the DB and deal
> > > > >> with the warning (either at an error level or with an SQL construct like
> > > > >> INSERT ... ON DUPLICATE KEY UPDATE) or query that table first to see if
> > > > >> what you're inputting is unique.
> > > > >>
> > > > >> As you're doing this with binary data, I would recommend generating a
> > > > >> hash of the image and comparing that, as MySQL might have problems doing
> > > > >> comparisons of such large objects, and it won't be quick.
> > > > >
> > > > >Ash,
> > > > >
> > > > >Look what I have done,
> > > > >
> > > > ><?php
> > > > >require 'database.php';
> > > > >
> > > > >$url = "http://www.albaldnews.com/rss.php?cat=24";
> > > > >$rss = simplexml_load_file($url);
> > > > >
> > > > >if($rss)
> > > > >{
> > > > >echo '<h1>'.$rss->channel->title.'</h1>';
> > > > >echo '<li>'.$rss->channel->pubDate.'</li>';
> > > > >$items = $rss->channel->item;
> > > > >foreach($items as $item)
> > > > >{
> > > > >
> > > > >$link = $item->link;
> > > > >
> > > > >mysqli_real_escape_string($conn,$item->link);
> > > > >
> > > > >$query1 = "SELECT * from feedtable ";
> > > > >$result1= mysqli_query($conn, $query1);
> > > > >
> > > > >if ( mysqli_num_rows ($result1 ) > 0 ) {
> > > > >echo "duplicate entry\n";
> > > > >}
> > > > >
> > > > >else
> > > > >{
> > > > >$sql = "INSERT INTO feedtable (link)VALUES ('$item->link')";
> > > > >$result = mysqli_query($conn, $sql);
> > > > >
> > > > >if ($sql == true)
> > > > >{
> > > > >echo "link added\n";
> > > > >}
> > > > >}
> > > > >
> > > > >}
> > > > >}
> > > > >?>
> > > > >
> > > > >But it only inserts one link out of many, and I don't know why that
> > > > >happens; it is supposed to insert all the links from the feed.
> > > > >If it finds a duplicate it will "echo", otherwise it will insert all
> > > > >the links. The links are different from each other; there are no
> > > > >duplicates among them.
> > > >
> > > > No, you're checking to see if $sql equates to true. In your code
> > > > $sql is a string, and is always true.
> > > >
> > > > In the case of only one entry being added, have you at least checked
> > > > to see what queries are being run? At a basic level, outputting the
> > > > sql you're running would help.
> > >
> > > Hi Ash,
> > >
> > > I corrected my previous code. It was wrong. In my new code I'm able to
> > > discard duplicate entries to the database. But the image keeps being
> > > added to the database even if there's a duplicate entry in the database.
> > > Please see my code and let me know what I'm missing.
> > >
> > > <?php
> > > require 'database.php';
> > >
> > > $url = "http://www.albaldnews.com/rss.php?cat=24";
> > > $rss = simplexml_load_file($url);
> > >
> > > if($rss)
> > > {
> > > echo '<h1>'.$rss->channel->title.'</h1>';
> > > echo '<li>'.$rss->channel->pubDate.'</li>';
> > > $items = $rss->channel->item;
> > > foreach($items as $item)
> > > {
> > >
> > > $enclosure = $item->enclosure[0]['url'];
> > >
> > > $ch = curl_init ("$enclosure");
> > > curl_setopt($ch, CURLOPT_HEADER, 0);
> > > curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> > > curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
> > > $rawdata=curl_exec ($ch);
> > > curl_close ($ch);
> > >
> > > $img=mysqli_real_escape_string($conn,$rawdata);
> > >
> > > $query = "SELECT image from feedtable where link = '$img'";
> > > $result= mysqli_query($conn, $query);
> > >
> > > $num_rows = mysqli_num_rows($result);
> > >
> > > if ($num_rows == 0)
> > > {
> > >
> > > $query1 = "INSERT INTO feedtable (image)VALUES ('$img')";
> > > $result1= mysqli_query($conn, $query1);
> > >
> > > echo "image added\n";
> > >
> > > }
> > >
> > > else
> > > {
> > > echo "duplicate entry\n";
> > >
> > > }
> > >
> > > }
> > > }
> > >
> > > ?>
> >
> > I'm assuming from the logic you have in your queries that your `image`
> > table only contains a single field called `link` (which is a misnomer as
> > you're storing the image as a binary blob, and not the link to the
> > image.)
> >
> > As I mentioned earlier, comparing binary blobs like this in MySQL is not
> > a good idea, so it's little wonder you're running into issues.
> >
> > Have a look at generating a hash of the image file and storing that
> > also, and then run a comparison on that. A hash isn't guaranteed 100% to
> > be unique, but the chances of two images from that RSS feed having the
> > same hash are so small that you can treat it as unique.
>
> Ash,
>
> My code is working fine. It was a syntax mistake that I didn't see.
>
> > > $query = "SELECT image from feedtable where link = '$img'";
>
> Instead of link I now give it image; that was the problem.
>
> Thank you for your support. :)
>

Ok, it might be working "fine", but you will notice a dramatic speed
increase by moving to comparing the image hashes instead of the raw
image data.

Also, this line:

$ch = curl_init ("$enclosure");

Doesn't need the quotes. $enclosure is actually a SimpleXMLElement rather
than a plain string, so if you want a string, an explicit (string) cast is
clearer than wrapping the variable in quotes. It's considered bad practice
to quote a variable like this when it's not necessary.

--
Thanks,
Ash
http://www.ashleysheridan.co.uk

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
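
For anyone following the hashing advice in this thread, here is a minimal sketch of what comparing hashes instead of raw image data might look like. It is only an illustration under assumptions that are not in the original posts: an extra `image_hash` column on `feedtable` (ideally with a UNIQUE index), SHA-256 as the hash, and the helper names `image_hash()` and `store_image_once()`, which are made up for this example. Error handling is left out.

```php
<?php
// Sketch only. Assumed (not from the thread): feedtable has an extra
// `image_hash` CHAR(64) column, and both helper names are hypothetical.

/** Return the SHA-256 hex digest (64 characters) of the raw image bytes. */
function image_hash(string $rawdata): string
{
    return hash('sha256', $rawdata);
}

/**
 * Insert the image only if no row with the same hash exists yet.
 * Returns true when a row was inserted, false for a duplicate.
 */
function store_image_once(mysqli $conn, string $rawdata): bool
{
    $hash = image_hash($rawdata);

    // Compare the short hash rather than the whole blob.
    $check = $conn->prepare('SELECT 1 FROM feedtable WHERE image_hash = ? LIMIT 1');
    $check->bind_param('s', $hash);
    $check->execute();
    $check->store_result();
    $exists = $check->num_rows > 0;
    $check->close();

    if ($exists) {
        return false;
    }

    // Prepared statements make mysqli_real_escape_string() unnecessary here.
    $insert = $conn->prepare('INSERT INTO feedtable (image, image_hash) VALUES (?, ?)');
    $insert->bind_param('ss', $rawdata, $hash);
    $ok = $insert->execute();
    $insert->close();

    return $ok;
}
```

Inside the feed loop this would be called as `store_image_once($conn, $rawdata)` after `curl_exec()`. If the assumed `image_hash` column does carry a UNIQUE index, the SELECT could be dropped in favour of the INSERT ... ON DUPLICATE KEY UPDATE construct mentioned earlier in the thread.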