On Tue, Jul 9, 2013 at 5:04 AM, Robert James <srobertjames@xxxxxxxxx> wrote: > On 7/8/13, hubert depesz lubaczewski <depesz@xxxxxxxxxx> wrote: >> On Mon, Jul 08, 2013 at 09:09:26AM -0400, Robert James wrote: >>> I have two relations, where each relation has two fields, one >>> indicating a name and one indicating a position. That is, each >>> relation defines a sequence. >>> >>> I need to determine their longest common subsequence. Yes, I can do >>> this by fetching all the data into Java (or any other language) and >>> computing it using the standard LCS dynamic programming language. But >>> I'd like to stay within Postgres. Is there any way to do this? >> >> I'm not entirely sure I understand. Can you show us some sample data and >> expected output? > > Sure. Borrowing a good example from > http://wordaligned.org/articles/longest-common-subsequence : > > Table A (val varchar primary key, pos integer): > 1, "C" > 2, "H" > 3, "I" > 4, "M" > 5, "P" > 6, "A" > 7, "N" > 8, "Z" > 9, "E" > 10, "E" > > Table B (val varchar primary key, pos integer): > 1, "H" > 2, "U" > 3, "M" > 4, "A" > 5, "N" > > SELECT LongestCommonSubsequence(A,B): > 1, "H" > 2, "M" > 3, "A" > 4, "N" > > (Common chars are in upper case: > cHiMpANzee > HuMAN > ) > > The std dynamic programming algorithm is described at > http://en.wikipedia.org/wiki/Longest_common_subsequence_problem Your best bet on that would be to implement using either a procedural language (plpython, plperl that are in core, or even another one), or a C function and install it on server side with an extension. I'll do the latter if I were you. Roughly such a function would take in arguments the OIDs of the functions to compare (or their relation name), and return a set of rows describing the longest sequence found. -- Michael -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general