Search Postgresql Archives

Re: Postgresql GROUP BY "SIMILAR" but not equal values

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 6 February 2014 16:18, alexandros_e <alexandros.ef@xxxxxxxxx> wrote:
> Let's say I have this table foo
>
> ID|G1|T1|
> 1|2|ABC|
> 1|2|ABCD|
> 1|2|DEF|
> 1|2|DEFG|
>
> SELECT * FROM foo
> GROUP BY ID,G1,T1

> Is there a way in SQL or PostgreSQL in general to group by values than are
> not exactly the same but are quite similar (like 'ABC' and 'ABCD') based on
> some distance function (levenshtein for example) if the distance is within
> some threshold (i.e., 1)

Perhaps there is: You can calculate the levenshtein distance between
those values using a self-join and then GROUP BY the result of that
expression and limit the results with HAVING.

For example:
SELECT foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1)
FROM foo foo1
INNER JOIN foo foo2 ON (foo2.ID = foo1.ID AND foo2.G1 = foo1.G1)
GROUP BY foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1)
HAVING levenshtein(foo1.T1, foo2.T1) > 1

Is that what you're looking for?

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux