Search Postgresql Archives

Re: Finding matching words in a word game

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 5, 2013 at 8:29 PM, Alexander Farber
<alexander.farber@xxxxxxxxx> wrote:
> is there maybe a clever way of finding all possible words
> from a given set of letters by means of PostgreSQL
> (i.e. inside the database vs. scanning all database
> rows by a PHP script, which would take too long) -
> if the dictionary is kept in a simple table like:
>
> create table good_words (
>         word varchar(16) primary key,
>         stamp timestamp default current_timestamp
> );

How many words are you looking at, in your dictionary? I wrote an
anagramming program in C++ that works off a fairly small dictionary of
common words (~60K words) and gives adequate performance without any
indexing - the time is dwarfed by just scrolling the text up the
console. The only thing I'd recommend doing differently is the
language - use one that has a proper hash/tree type, saving you the
trouble of implementing one (I implemented my own non-balancing binary
tree for the task... no wait, on examination, it seems to actually be
a linear search - and yet it has passable performance). PHP can quite
probably do everything you want here; otherwise, I'd recommend
something like Python or Pike.

Simple Python example:

words = {}
for word in dictionary:   # provide a dictionary somehow - maybe from a file/db
    words.setdefault(''.join(sorted(word)),[]).append(word)
# Voila! You now have your mapping. One-off initialization complete.

find_anagrams_of = "stop"
anagrams = words.get(''.join(sorted(find_anagrams_of)),[])
# anagrams is now a list of all known anagrams of the target -
possibly an empty list
print(anagrams)


['opts', 'post', 'pots', 'spot', 'stop', 'tops']

On my laptop, loading ~100K words took about 1 second, and the lookup
took effectively no time. I don't think there's any need for a heavy
database engine here, unless you're working with millions and millions
of words :)

Chris Angelico


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux