Search Postgresql Archives

Re: Could use some advice on search architecture

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>On 04/19/2014 06:26 AM, Ron Pasch wrote:
>>  > - It should be possible to search for products and provide properties
>>  > that the product SHOULD have, not must have.
>>>
>>> I don't understand this.  Say you have a sprocket in red and green. Do you want to search for:
>>>
>>> select * from product where name = 'sprocket' and (color = 'red' or color = 'green')
>>>
>>> Or do you want something else?  Does the user say they'd "prefer" blue, but will take whatever else you have?
>>>
>>> Do you search for some properties exactly and some "preferred"?
>>>
>>> Perhaps you could describe a little more how you want to query the database?  Or, maybe, what your user's are searching for?
>>>
>>> -Andy
>>>
>>
>> Yes, the user can prefer certain properties and the products that match most of the properties should be in the top of the results, but if a product doesn't match all of them but just some of them, they should still be returned, but lower in the results.
>>
>> I'm seriously wondering if doing this solely with postgres is even possible without having long execution times. I've done some tests with 5 million records and just doing the "or" construction you mentioned above, which resulted in 600 to 900 ms queries and returning only those records of which all properties match at least one selected value.

 I don't think that OR clauses are the right way as the aim is to count how many attributes do match the search.
 basically a standard approach would look like:
 
	 SELECT pID, sum(match) as matches FROM
	 (
	 selct pID, 1 as match from Products where color ='pink'
	 UNION ALL
	 selct pID, 1 as match from Products where size ='XXL'
	 ...
	 )foo
	GROUP BY pID order by matches DESC
	
How many distinct attributes are involved ? ( 15 colors + 9 sizes + ....)

Marc

>>
>> I was thinking that perhaps using a search engine like lucene or sphinx would be more appropriate, but then I wonder what I would exactly be indexing and how I would be querying that, but that's a question for a different mailing list ;-)
>
>Please keep the list cc'd, so others can help as well.
>
>Yeah, doing a bunch of or's is gonna have to test all 5 million products.
>
>I wonder if there is a way we can treat this like a two step process.
>
>1) cut down the number of products
>
>2) sort them by #matches, popularity, etc
>
>You've talked about #2, but how about #1.  Is there any way to either include or exclude a product?  Users don't just ask for red, they ask for "tires (maybe red)".  Not all 5 million products are tires, right?
>
>-Andy


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux