Search Postgresql Archives

Re: Working with huge amount of data. RESULTS!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 12 Feb 2008, Mario Lopez wrote:

Hi!,

I optimized the LIKE 'keyword%' and LIKE '%keyword' with the following results:

# time /Library/PostgreSQL8/bin/psql -U postgres -d testdb -c "select * from table1 where varchar_reverse(data) like varchar_reverse('%keyword');"
real    0m0.055s
user    0m0.011s
sys     0m0.006s


# time /Library/PostgreSQL8/bin/psql -U postgres -d testdb -c "select * from table1 where data like 'keyword%';"
real    0m0.026s
user    0m0.012s
sys     0m0.006s

It works flawlesly as you can see by the timings, take in consideration that "table1" has 100 million records. The only problem is generating the reversed index which takes like 20 minutes, I guess it has to do with the plperl function, perhaps a C function for inverting would make it up in less time.

The problem is still with the LIKE '%keyword%', my problem is that I am not searching for Words in a dictionary fashion, suppose my "data" is random garbage, that it has common consecutive bytes. How could I generate a dictionary from this random garbage to make it easier for indexing?

suffix tree (array) would speedup '%keyword%' query, but currently it doesn't
supported by GiST extension architecture (we have it in our TODO), so I see several ways (not tested):

1. try contrib/pg_trgm to reduce a number of candidate words
2. generate all possible substrings and use your tested approach



On Mon, Feb 11, 2008 at 04:37:24PM +0100, Mario Lopez wrote:

SELECT * FROM names WHERE name LIKE keyword%
Or
SELECT * FROM names WHERE name LIKE %keyword%


check this:
http://www.depesz.com/index.php/2007/07/30/indexable-field-like-something/
and this:
http://www.depesz.com/index.php/2007/09/15/speeding-up-like-xxx/

depesz




---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
     choose an index scan if your joining column's datatypes do not
     match


	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux