Search Postgresql Archives

Re: looking for a faster way to do that

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



You forgot to include the list ;)

On 26 Sep 2011, at 6:06, hamann.w@xxxxxxxxxxx wrote:

> Alban Hertroys wrote:
>>> 
>>> To me it sounds a little bit like you're comparing every item in a =
>>> warehouse to a set of descriptions to see what type of item it is, which =
>>> is something you would be much better off storing as a property of the =
>>> item. If an item is a fruit, store that it's a fruit!
>>> But I'm guessing at what you're trying to accomplish, so here's a few =
>>> other options...
>>> 
>>> I guess you could create 2781 expression indexes to do what you want (is =
>>> there a limit that prevents this?). Query planning would probably become =
>>> kind of slow and the indices will take up a considerable fraction of the =
>>> total table storage required - that's a pretty outlandish approach.
>>> 
>>> CREATE INDEX tbl_str_regex1 ON tbl (regexp_matches(str, 'string1'));
>>> CREATE INDEX tbl_str_regex1 ON tbl (regexp_matches(str, 'string2'));
>>> ...
>>> CREATE INDEX tbl_str_regex1 ON tbl (regexp_matches(str, 'string2781'));
> 
> Hi,
> 
> the strings are not really known before.
> Let me explain the scenario; there is one table about products, and code is the
> manufacturer's (or resellers') product id.
> So, if ABC were a maker of laptops, ABC123 and ABC456 might be two different machines,
> and ABC123G might have G3 mobile installed, or ABC123X might be the same thing
> with extra memory. Obviously these device variants all look the same.
> Now reseller sends us a collection of product images, so there would be ABC123.jpg
> and ABC456.jpg
> The database task at hand is matching products to images (and then inserting the image
> name into a column of the products table).


I guessed right then. The origin of your problem is that you have similar items in your database, but the database doesn't know they are similar. I'd fix that first, it makes the problem a whole lot easier to handle.

For example, if an image comes in named ABC123G.jpg, you look up the product and manufacturer and update its image. Then you query for products of the same manufacturer that are similar to ABC123G (result: ABC123 and ABC123X) and update their images as well (if appropriate; perhaps they have a recent enough image of their own?).


As another whacky alternative to your regular expressions; I think it would be possible to abuse the text-search functionality in Postgres to match product id's. Those id's are basically a language per manufacturer describing product details.

If you can split the product id's up into lexemes that describe the base product id and it's options, then you can use full-text search to match up expressions similar to the lexemes derived from the image name.

For example:
 productid | lexemes
-----------+------------------
 ABC123    | {'ABC' '123'}
 ABC123G   | {'ABC' '123' 'G'}
 ABC123X   | {'ABC' '123' 'X'}
 ABC456    | {'ABC' '456'}

I'm not really sure if that's possible, or how much work it would be per manufacturer - I haven't used FTS much.

I'd first see if I couldn't add that similarity information to the products table, though ;)

Alban Hertroys

--
The scale of a problem often equals the size of an ego.



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux