RE: How do I find products when a user types freeform strings like 'Sony 20" TV' or '20" Sony TV'? [SOLVED]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Daevid Vincent [mailto:daevid@xxxxxxxxxx] 
> Sent: Friday, May 04, 2007 1:22 AM
> To: mysql@xxxxxxxxxxxxxxx
> Subject: How do I find products when a user types freeform 
> strings like 'Sony 20" TV' or '20" Sony TV'?
> 
> I'm having trouble figuring out the logic/query I want.
> I know that all those "OR"s are not right.
> I'm doing this in PHP and mySQL (of course), 
> so if it can't be done with a single query, I can split it up.
> 
> Here's the challenge, given a text field search box, someone enters:
> 
> 	Sony 20" TV
> 
> How do I search for that, not knowing which fields are which?
> For example, they could have also entered:
> 
> 	20" Sony TV
> 
> This is the one I have now, but (as you probably noticed), it 
> will return many rows,
> I expect that most of the time > 1 row will be returned, but 
> I'm getting a grip more than I want (or the customer would want), and
> also rows that have nothing to do with the search terms.
> 
> SELECT products.*, companies.name AS company_name, 
> categories.name AS category_name 
> FROM 	 products 
> 	 LEFT JOIN companies ON company_id = companies.id 
> 	 LEFT JOIN categories ON category_id = categories.id 
> WHERE  products.enabled = 1 
> 	AND( 
> 	    (products.model LIKE 'sony%'   OR products.model 
> LIKE '20%'  OR products.model LIKE 'tv%') 
> 	 OR (products.upc LIKE 'sony'      OR products.upc LIKE 
> '20'     OR products.upc LIKE 'tv') 
> 	 OR (products.name LIKE '%sony%'   OR products.name 
> LIKE '20%'   OR products.name LIKE '%tv%') 
> 	 OR (companies.name LIKE 'sony%'   OR companies.name 
> LIKE '20%'  OR companies.name LIKE 'tv%') 
> 	 OR (categories.name LIKE '%sony%' OR categories.name 
> LIKE '20%' OR categories.name LIKE '%tv%') 
> 	) 
> ORDER BY categories.name DESC, products.name ASC, companies.name ASC;
> 
> (and that just gets uglier the more words in the search)
> 
> +----+------------------+--------------+--------------+-------
> --------+
> | id | name             | model        | company_name | 
> category_name |
> +----+------------------+--------------+--------------+-------
> --------+
> |  1 | 20" TV           | STV20-KVR-HD | Sony         | Tube  
>         | <---
> |  2 | 36" TV           | STV36-KVR-HD | Sony         | Tube  
>         | 
> |  4 | Ultra-Plasma 62" | UP62F900     | Sony         | 
> Plasma        | 
> |  5 | Value Plasma 38" | VPR542_38    | Sony         | 
> Plasma        | 
> |  6 | Power-MP3 5gb    | 09834wuw34   | Sony         | MP3 
> Players   | 
> |  3 | Super-LCD 42"    | SLCD42hd002  | Sony         | LCD   
>         | 
> |  7 | Super-Player 1gb | SP1gb        | Sony         | Flash 
>         | 
> |  8 | Porta CD         | pcd500       | Sony         | CD 
> Players    | 
> ......
> +----+------------------+--------------+--------------+-------
> --------+
> 
> Obviously the person wanted id = 1 in this case.
> 
> Unrelated, is there any speed improvement using JOIN instead 
> of LEFT JOIN ?
> Think millions of products.
> 
> Thanks for help and suggestions...
> 
> Daevid.

I'll attach a .php file, but this list server may strip it off, so I'll also paste it below, sorry for any formatting issues in
advance... 

<?php
if ($_POST['keywords'])
{
	$_POST['keywords'] = stripslashes($_POST['keywords']);
	$words = preg_split("/\s+/",$_POST['keywords'], -1, PREG_SPLIT_NO_EMPTY);
}

$sql = 'SELECT 	products.* FROM product_table WHERE 1 ';
$sql .= keyword_filter($words, array('products.model%', 'products.upc', '%products.name%', 'companies.name%', '%categories.name%'),
true);
$sth = SQL_QUERY($sql);

/**
* Builds the WHERE portion of a SQL statement using the keywords in various columns with wildcard support.
* 
* @return 	string SQL statement fragment
* @param	mixed $words either a string of words space deliminated or an array of words
* @param	array $columns an array of table.column names to search the $words in. Use % as a wildcard for example pass in
'username%' or '%username%'.
* @param	boolean $and (true) whether the words have to be ANDed or ORed together.
* @author 	Daevid Vincent [daevid@xxxxxxxxxxx]
* @since 	1.0
* @version 	1.4
* @date    	05/10/07
* @todo		This should handle +, - and "" just like google or yahoo or other search engines do.
*/
function keyword_filter($words, $columns, $and = true)
{
	// this maybe useful
	// http://wiki.ittoolbox.com/index.php/Code:Translate_Boolean_Query_to_SQL_select_statement
	// http://www.ibiblio.org/adriane/queries/
	// http://www.zend.com/zend/tut/tutorial-ferrara1.php?article=tutorial-ferrara1&kind=t&id=8238&open=1&anc=0&view=1
	
	// http://evolt.org/article/Boolean_Fulltext_Searching_with_PHP_and_MySQL/18/15665/index.html
	// http://www.databasejournal.com/features/mysql/article.php/3512461
	
	// this would be great, but the dumb-asses don't work with InnoDB tables. GRRR!
	// http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
	//$sql .= " AND MATCH (".implode(',',$columns).") AGAINST ('".implode(' ',$words)."' IN BOOLEAN MODE)";
	
	if (!is_array($columns) or !$words) return;
	
	if (is_string($words))
		$words = preg_split("/\s+/",$words, -1, PREG_SPLIT_NO_EMPTY);
	
	if(count($words) < 1) return '';
	
	if ($and) //AND the words together
	{
		$sql = " AND ";
		$sqlArray = array();
		foreach($words as $word)
		{
			$tmp = array();
			foreach($columns as $field)
			{
				$col = str_replace('%','',$field);
				//[dv] read the http://php.net/preg_replace carefully. You must use this format, 
				//	   because otherwise $words that are digits will cause undesired results.
				$myword = preg_replace("/(%)?([\w\.]+)(%)?/", "\${1}".$word."\${3}", $field );
				$tmp[] = $col." LIKE '".SQL_ESCAPE($myword)."'";
			}
			$sqlArray[] = " (".implode(" OR ",$tmp).") ";
		}
		$sql .= implode(" AND ", $sqlArray);
	}
	else //OR the words together
	{
		$sql = " AND ( ";
		$sqlArray = array();
		foreach($columns as $field)
		{
			$col = str_replace('%','',$field);
			
			$tmp = array();
			foreach($words as $word)
			{
				//[dv] read the http://php.net/preg_replace carefully. You must use this format, 
				//	   because otherwise $words that are digits will cause undesired results.
				$myword = preg_replace("/(%)?([\w\.]+)(%)?/", "\${1}".$word."\${3}", $field );
				$tmp[] = $col." LIKE '".SQL_ESCAPE($myword)."'";
			}
			$sqlArray[] = "(".implode(" OR ",$tmp).") ";
		}
		$sql .= implode(" OR ", $sqlArray);
		$sql .= ") ";
	}
	
	return $sql;
}
?>

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

[Index of Archives]     [PHP Home]     [PHP Users]     [Postgresql Discussion]     [Kernel Newbies]     [Postgresql]     [Yosemite News]

  Powered by Linux