Try to use single regular expression. 2011/8/10, Grzegorz Blinowski <g.blinowski@xxxxxxxxx>: > Dear All, > > I have some problems with regexp queries performance - common sense tells me > that my queries should run faster than they do. > > The database - table in question has 590 K records, table's size is 3.5GB. I > am effectively querying a single attribute "subject" which has an average > size of 2KB, so we are doing a query on ~1GB of data. The query looks more > or less like this: > > SELECT T.tender_id FROM archive_tender T WHERE > (( T.subject !~* '\\mpattern1.*\\M' ) AND ( T.subject ~* '\\mpattern2\\M' OR > [4-5 more similar terms] ) AND T.erased = 0 AND T.rejected = 0 > ORDER BY > tender_id DESC > LIMIT > 10000; > > The planner shows seq scan on subject which is OK with regexp match. > > Now, the query above takes about 60sec to execute; exactly: 70s for the > first run and 60s for the next runs. In my opinion this is too long: It > should take 35 s to read the whole table into RAM (assuming 100 MB/s > transfers - half the HDD benchmarked speed). With 12 GB of RAM the whole > table should be easily buffered on the operating system level. The regexp > match on 1 GB of data takes 1-2 s (I benchmarked it with a simple pcre > test). The system is not in the production mode, so there is no additional > database activity (no reads, no updates, effectively db is read-only) > > To summarize: any idea how to speed up this query? (please, don't suggest > regexp indexing - in this application it would be too time consuming to > implement them, and besides - as above - I think that Postgres should do > better here even with seq-scan). > > Server parameters: > RAM: 12 GB > Cores: 8 > HDD: SATA; shows 200 MB/s transfer speed > OS: Linux 64bit; Postgres 8.4 > > > Some performance params from postgresql.conf: > max_connections = 16 > shared_buffers = 24MB > temp_buffers = 128MB > max_prepared_transactions = 50 > work_mem = 128MB > maintenance_work_mem = 1GB > effective_cache_size = 8GB > > Database is vacuumed. > > > Regards, > > Greg > -- ------------ pasman -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance