On Mon, Jun 22, 2015 at 4:51 AM, Christian Ramseyer <rc@xxxxxxxxxxx> wrote:
Hi
I have a pretty large table with syslog messages.
It is already partitioned by month, and for a single month I have e.g.
DM=# \d+ logs_01
Column | Type |
--------------+-----------------------------+
host | character varying(255) |
facility | character varying(10) |
priority | character varying(10) |
tag | character varying(255) |
log_date | timestamp without time zone |
program | character varying(255) |
msg | text |
seq | bigint |
Indexes:
"logs_01_pkey" PRIMARY KEY, btree (seq)
"idx_logs_01_lower_host" btree (lower(host::text) varchar_pattern_ops)
"logs_01_date_index" btree (log_date)
"tridx_logs_01_msg" gin (msg gin_trgm_ops)
DM=# select count(*) from logs_01;
count
----------
83052864
...
A typical query on this table looks like this:
explain analyze
select log_date, host, msg
from logs_01 as log where log.msg like '%192.23.33.177%'
and log.log_date >= '2015-1-18 1:45:24'
and log.log_date <= '2015-1-19 1:45:24'
order by log_date asc offset 200 limit 50;
I think that trigram indexes are not well-suited to searching IP addresses.
If the typical query is always an IP address for the LIKE, I think you would want to build an index specifically tailored to that. You could make a function to parse the IP address out of the msg, and then make a functional index, for example. It would require you to write the query differently. Whether it would be a btree index or a gin index would depend on whether you can have more than one IP address in a msg.
Cheers,
Jeff