Re: duplicate image


 



On Mon, 2015-02-16 at 00:00 +0300, hadi wrote:
> 
> > -----Original Message-----
> > From: Ashley Sheridan [mailto:ash@xxxxxxxxxxxxxxxxxxxx]
> > Sent: Sunday, February 15, 2015 11:22 PM
> > To: hadi
> > Cc: php-general@xxxxxxxxxxxxx
> > Subject: Re:  duplicate image
> > 
> > On Sun, 2015-02-15 at 23:08 +0300, hadi wrote:
> > >
> > > > -----Original Message-----
> > > > From: Ashley Sheridan [mailto:ash@xxxxxxxxxxxxxxxxxxxx]
> > > > Sent: Sunday, February 15, 2015 4:07 PM
> > > > To: hadi; php-general@xxxxxxxxxxxxx
> > > > Subject: RE:  duplicate image
> > > >
> > > >
> > > >
> > > > On 15 February 2015 12:51:35 GMT+00:00, hadi
> > > > <almarzuki2011@xxxxxxxxxxx> wrote:
> > > > >
> > > > >
> > > > >> -----Original Message-----
> > > > >> From: Ashley Sheridan [mailto:ash@xxxxxxxxxxxxxxxxxxxx]
> > > > >> Sent: Sunday, February 15, 2015 1:43 PM
> > > > >> To: hadi; php-general@xxxxxxxxxxxxx
> > > > >> Subject: Re:  duplicate image
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 15 February 2015 10:24:23 GMT+00:00, hadi
> > > > >> <almarzuki2011@xxxxxxxxxxx> wrote:
> > > > >> >Hi,
> > > > >> >
> > > > >> >I have script which download rssfeed from the internet. But
> > > > >> >unfortunately it keep downloading duplicate image to the database.
> > > > >> >
> > > > >> >Here is  my script
> > > > >> >
> > > > >> ><?php
> > > > >> >require 'database.php';
> > > > >> >
> > > > >> >$url = "http://www.albaldnews.com/rss.php?cat=24";
> > > > >> >$rss = simplexml_load_file($url);
> > > > >> >
> > > > >> >if($rss)
> > > > >> >{
> > > > >> >echo '<h1>'.$rss->channel->title.'</h1>';
> > > > >> >echo '<li>'.$rss->channel->pubDate.'</li>';
> > > > >> >$items = $rss->channel->item;
> > > > >> >foreach($items as $item)
> > > > >> >{
> > > > >> >
> > > > >> >$title = $item->title;
> > > > >> >$link = $item->link;
> > > > >> >$published_on = $item->pubDate;
> > > > >> >$description = $item->description; $category = $item->category;
> > > > >> >$guid = $item->guid; $enclosure = $item->enclosure[0]['url'];
> > > > >> >
> > > > >> >
> > > > >> >$ch = curl_init ("$item->enclosure");
> > > > >> >curl_setopt($ch, CURLOPT_HEADER, 0);
> > > > >> >curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> > > > >> >curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
> > > > >> >$rawdata=curl_exec ($ch);
> > > > >> >curl_close ($ch);
> > > > >> >
> > > > >> >
> > > > >> >mysqli_real_escape_string($conn,$item->title);
> > > > >> >mysqli_real_escape_string($conn,$item->link);
> > > > >> >mysqli_real_escape_string($conn,$item->pubDate);
> > > > >> >mysqli_real_escape_string($conn,$item->description);
> > > > >> >mysqli_real_escape_string($conn,$item->category);
> > > > >> >$img=mysqli_real_escape_string($conn,$rawdata);
> > > > >> >
> > > > >> >$sql = "INSERT INTO feedtable
> > > > >> >(title,link,pubdate,description,category,image)VALUES
> > > > >> >('$item->title','$item->link','$item->pubDate','$item->description','$item->category','$img')";
> > > > >> >$result = mysqli_query($conn, $sql);
> > > > >> >
> > > > >> >if ($conn->query($sql) === TRUE) {
> > > > >> >    echo "New record created successfully\n"; } else {
> > > > >> >    echo "Error:\n " . $sql . "<br>" . $conn->error; }
> > > > >> >
> > > > >> >}
> > > > >> >}
> > > > >> >
> > > > >> >?>
> > > > >>
> > > > >> You need to either set a unique index on that field in the DB and
> > > > >> deal with the warning (either at an error level or with an SQL
> > > > >> construct like INSERT ... ON DUPLICATE KEY UPDATE) or query that
> > > > >> table first to see if what you're inputting is unique.
> > > > >>
> > > > >> As you're doing this with binary data, I would recommend generating
> > > > >> a hash of the image and comparing that, as MySQL might have problems
> > > > >> doing comparisons of such large objects, and it won't be quick.
> > > > >
> > > > >
> > > > >Ash,
> > > > >
> > > > >Look what I have done,
> > > > >
> > > > ><?php
> > > > >require 'database.php';
> > > > >
> > > > >$url = "http://www.albaldnews.com/rss.php?cat=24";
> > > > >$rss = simplexml_load_file($url);
> > > > >
> > > > >if($rss)
> > > > >{
> > > > >echo '<h1>'.$rss->channel->title.'</h1>';
> > > > >echo '<li>'.$rss->channel->pubDate.'</li>';
> > > > >$items = $rss->channel->item;
> > > > >foreach($items as $item)
> > > > >{
> > > > >
> > > > >$link = $item->link;
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >mysqli_real_escape_string($conn,$item->link);
> > > > >
> > > > >$query1 = "SELECT * from feedtable "; $result1= mysqli_query($conn,
> > > > >$query1);
> > > > >
> > > > >if ( mysqli_num_rows ($result1 ) > 0 ) { echo "duplicate entry\n";
> > > > >}
> > > > >
> > > > >else
> > > > >{
> > > > >$sql = "INSERT INTO feedtable (link)VALUES ('$item->link')";
> > > > >$result = mysqli_query($conn, $sql);
> > > > >
> > > > >if ($sql == true)
> > > > >{
> > > > >echo "link added\n";
> > > > >}
> > > > >}
> > > > >
> > > > >}
> > > > >}
> > > > >?>
> > > > >
> > > > >But it only inserts one link out of many, and I don't know why. It is
> > > > >supposed to insert all the links from the feed: if it finds a
> > > > >duplicate it should echo a message, otherwise it should insert the
> > > > >link. The links are all different from each other, with no duplicates.
> > > >
> > > > No, you're checking to see if $sql equates to true. In your code
> > > > $sql is a string, and is always true.
> > > >
> > > > In the case of only one entry being added, have you at least checked
> > > > to see what queries are being run? At a basic level, outputting the
> > > > sql you're running would help.
> > >
> > > Hi Ash,
> > >
> > > I corrected my previous code. It was wrong. With my new code I'm able
> > > to discard duplicate entries. But the image keeps being added to the
> > > database even if there's already a duplicate entry in the database.
> > > Please see my code and let me know what I'm missing.
> > >
> > >
> > > <?php
> > > require 'database.php';
> > >
> > > $url = "http://www.albaldnews.com/rss.php?cat=24";
> > > $rss = simplexml_load_file($url);
> > >
> > > if($rss)
> > > {
> > > echo '<h1>'.$rss->channel->title.'</h1>';
> > > echo '<li>'.$rss->channel->pubDate.'</li>';
> > > $items = $rss->channel->item;
> > > foreach($items as $item)
> > > {
> > >
> > >
> > > $enclosure = $item->enclosure[0]['url'];
> > >
> > > $ch = curl_init ("$enclosure");
> > > curl_setopt($ch, CURLOPT_HEADER, 0);
> > > curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> > > curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
> > > $rawdata=curl_exec ($ch);
> > > curl_close ($ch);
> > >
> > >
> > > $img=mysqli_real_escape_string($conn,$rawdata);
> > >
> > >
> > >
> > >
> > >
> > > $query = "SELECT image from feedtable where link = '$img'";
> > > $result= mysqli_query($conn, $query);
> > >
> > > $num_rows = mysqli_num_rows($result);
> > >
> > > if ($num_rows == 0)
> > > {
> > >
> > > $query1 = "INSERT INTO feedtable (image)VALUES ('$img')";
> > > $result1= mysqli_query($conn, $query1);
> > >
> > > echo "image added\n";
> > >
> > > }
> > >
> > > else
> > > {
> > > echo "duplicate entry\n";
> > >
> > > }
> > >
> > > }
> > > }
> > >
> > > ?>
> > >
> > >
> > 
> > I'm assuming from the logic you have in your queries that your `image`
> > table only contains a single field called `link` (which is a misnomer as
> > you're storing the image as a binary blob, and not the link to the
> > image.)
> > 
> > As I mentioned earlier, comparing binary blobs like this in MySQL is not
> > a good idea, so it's little wonder you're running into issues.
> > 
> > Have a look at generating a hash of the image file and storing that
> > as well, and then run the comparison on that. A hash isn't guaranteed
> > to be 100% unique, but the chance of two images from that RSS feed
> > sharing a hash is so small that you can treat it as unique.
> 
> Ash,
> 
> My code is working fine now. It was a mistake in this query that I didn't spot:
> > > > $query = "SELECT image from feedtable where link = '$img'";
>
> Instead of link I now give it image; that was the problem.
> 
> Thank you for your support.:)
> 
> 

OK, it might be working "fine", but you should see a dramatic speed
increase if you compare image hashes instead of the raw image data.
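
A rough sketch of the hash approach, using PHP's built-in hash()
function. The helper names image_hash() and is_new_image() and the
in-memory $seen array are made up for illustration. In the real script
the hash would presumably go into its own column (ideally with a unique
index) and be looked up with a query rather than an array.

```php
<?php
// Hypothetical helper: returns a fixed-length fingerprint of the raw image bytes.
function image_hash($rawdata)
{
    // SHA-256 gives a 64-character hex string regardless of image size.
    return hash('sha256', $rawdata);
}

// Hypothetical helper: records the hash and reports whether it was new.
// $seen stands in for a lookup against a hash column in the database.
function is_new_image($rawdata, array &$seen)
{
    $hash = image_hash($rawdata);
    if (isset($seen[$hash])) {
        return false; // same bytes seen before: duplicate
    }
    $seen[$hash] = true;
    return true;
}

// Placeholder strings stand in for the data returned by curl_exec().
$seen = array();
foreach (array("image-bytes-A", "image-bytes-B", "image-bytes-A") as $rawdata) {
    echo is_new_image($rawdata, $seen) ? "image added\n" : "duplicate entry\n";
}
```

Comparing a short fixed-length string is much cheaper than comparing
whole blobs, and a column that size can be indexed easily.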

Also, this line:

$ch = curl_init ("$enclosure");

Doesn't need the quotes, as $enclosure is converted to a string
automatically when it's passed to curl_init(). Wrapping a variable in
quotes when it's not necessary is considered bad practice.
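
If you want a plain string without wrapping the variable in quotes, an
explicit cast at the point of assignment is one way to do it. This is
only a sketch: the XML fragment and the example.com URL are invented
stand-ins for a real feed item.

```php
<?php
// Made-up feed fragment standing in for one <item> of the real RSS feed.
$xml = '<item><enclosure url="http://example.com/a.jpg" type="image/jpeg"/></item>';
$item = simplexml_load_string($xml);

// Cast at assignment so $enclosure is a plain PHP string from here on.
$enclosure = (string) $item->enclosure[0]['url'];

var_dump(is_string($enclosure));
echo $enclosure, "\n";

// The value can now be passed straight through, e.g. curl_init($enclosure);
```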

-- 
Thanks,
Ash
http://www.ashleysheridan.co.uk




-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php




