On 6 February 2014 16:18, alexandros_e <alexandros.ef@xxxxxxxxx> wrote: > Let's say I have this table foo > > ID|G1|T1| > 1|2|ABC| > 1|2|ABCD| > 1|2|DEF| > 1|2|DEFG| > > SELECT * FROM foo > GROUP BY ID,G1,T1 > Is there a way in SQL or PostgreSQL in general to group by values than are > not exactly the same but are quite similar (like 'ABC' and 'ABCD') based on > some distance function (levenshtein for example) if the distance is within > some threshold (i.e., 1) Perhaps there is: You can calculate the levenshtein distance between those values using a self-join and then GROUP BY the result of that expression and limit the results with HAVING. For example: SELECT foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1) FROM foo foo1 INNER JOIN foo foo2 ON (foo2.ID = foo1.ID AND foo2.G1 = foo1.G1) GROUP BY foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1) HAVING levenshtein(foo1.T1, foo2.T1) > 1 Is that what you're looking for? -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general