Surprising no use of indexes - low performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

I've been struggling to understand what's happening on my databases/query for several days, and I'm turning to higher mind for a logical answer.

I'm dealing with a fairly large database, containing logs informations, that I crunch to get data out of it, with several indexes on them that I hoped were logical

\d ruddersysevents
                                 Table « public.ruddersysevents »
Colonne | Type | Modificateurs
--------------------+--------------------------+--------------------------------------------------
id | integer | non NULL Par défaut, nextval('serial'::regclass)
 executiondate      | timestamp with time zone | non NULL
 nodeid             | text                     | non NULL
 directiveid        | text                     | non NULL
 ruleid             | text                     | non NULL
 serial             | integer                  | non NULL
 component          | text                     | non NULL
 keyvalue           | text                     |
 executiontimestamp | timestamp with time zone | non NULL
 eventtype          | character varying(64)    |
 policy             | text                     |
 msg                | text                     |
Index :
    "ruddersysevents_pkey" PRIMARY KEY, btree (id)
    "component_idx" btree (component)
    "configurationruleid_idx" btree (ruleid)
    "executiontimestamp_idx" btree (executiontimestamp)
    "keyvalue_idx" btree (keyvalue)
    "nodeid_idx" btree (nodeid)
Contraintes de vérification :
    "ruddersysevents_component_check" CHECK (component <> ''::text)
    "ruddersysevents_configurationruleid_check" CHECK (ruleid <> ''::text)
    "ruddersysevents_nodeid_check" CHECK (nodeid <> ''::text)
"ruddersysevents_policyinstanceid_check" CHECK (directiveid <> ''::text)


It contains 11018592 entries, with the followinf patterns :
108492 distinct executiontimestamp
14 distinct nodeid
59 distinct directiveid
26 distinct ruleid
35 distinct serial

Related table/index size are
                relation                |  size
----------------------------------------+---------
 public.ruddersysevents                 | 3190 MB
 public.nodeid_idx                      | 614 MB
 public.configurationruleid_idx         | 592 MB
 public.ruddersysevents_pkey            | 236 MB
 public.executiontimestamp_idx          | 236 MB


I'm crunching the data by looking for each nodeid/ruleid/directiveid/serial with an executiontimestamp in an interval:

explain analyze select executiondate, nodeid, ruleid, directiveid, serial, component, keyValue, executionTimeStamp, eventtype, policy, msg from RudderSysEvents where 1=1 and nodeId = '31264061-5ecb-4891-9aa4-83824178f43d' and ruleId = '61713ff1-aa6f-4c86-b3cb-7012bee707dd' and serial = 10 and executiontimestamp between to_timestamp('2012-11-22 16:00:16.005', 'YYYY-MM-DD HH24:MI:SS.MS') and to_timestamp('2013-01-25 18:53:52.467', 'YYYY-MM-DD HH24:MI:SS.MS') ORDER BY executionTimeStamp asc; Sort (cost=293125.41..293135.03 rows=3848 width=252) (actual time=28628.922..28647.952 rows=62403 loops=1)
   Sort Key: executiontimestamp
   Sort Method:  external merge  Disk: 17480kB
-> Bitmap Heap Scan on ruddersysevents (cost=74359.66..292896.27 rows=3848 width=252) (actual time=1243.150..28338.927 rows=62403 loops=1) Recheck Cond: ((nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text)) Filter: ((serial = 10) AND (executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text))) -> BitmapAnd (cost=74359.66..74359.66 rows=90079 width=0) (actual time=1228.610..1228.610 rows=0 loops=1) -> Bitmap Index Scan on nodeid_idx (cost=0.00..25795.17 rows=716237 width=0) (actual time=421.365..421.365 rows=690503 loops=1) Index Cond: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text) -> Bitmap Index Scan on configurationruleid_idx (cost=0.00..48562.32 rows=1386538 width=0) (actual time=794.490..794.490 rows=1381391 loops=1) Index Cond: (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text)
 Total runtime: 28657.352 ms



I'm surprised that the executiontimestamp index is not used, since it seems to be where most of the query time is spent.

For all my tests, I removed all the incoming logs, so that this table has only selects and no writes

I'm using Postgres 8.4, on a quite smallish VM, with some process runnings, with the following non default configuration
shared_buffers = 112MB
work_mem = 8MB
maintenance_work_mem = 48MB
max_stack_depth = 3MB
wal_buffers = 1MB
effective_cache_size = 128MB
checkpoint_segments = 6

Increasing the shared_buffers to 384, 1GB or 1500MB didn't improve the performances (less than 10%). I would have expected it to improve, since the indexes would all fit in RAM Every explain analyze made after modification of configuration and indexes where done after a complete batch of crunch of logs by our apps, to be sure the stats were corrects

I tested further with the indexes, and reached these results :
1/ dropping the unique index "configurationruleid_idx" btree (ruleid), "executiontimestamp_idx" btree (executiontimestamp), "keyvalue_idx" btree (keyvalue), "nodeid_idx" btree (nodeid) to replace them by a unique index did lower the perfs :

create index composite_idx on ruddersysevents (executiontimestamp, ruleid, serial, nodeid);

Index Scan using composite_idx on ruddersysevents (cost=0.01..497350.22 rows=3729 width=252) (actual time=7.989..83717.901 rows=62403 loops=1) Index Cond: ((executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text))


2/ Removing nodeid from the index did lower again the perf

create index composite2_idx on ruddersysevents (executiontimestamp, ruleid, serial);

Index Scan using composite2_idx on ruddersysevents (cost=0.01..449948.98 rows=3729 width=252) (actual time=23.507..84888.349 rows=62403 loops=1) Index Cond: ((executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10))
   Filter: (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)

3/ Removing executiontimestamp from the composite index makes the query performs better at the begining of its uses (around 17 secondes), but over time it degrades (I'm logging query longer than 20 secondes, and there are very rare in the first half of the batch, and getting more and more common at the end) to what is below

create index composite3_idx on ruddersysevents (ruleid, serial, nodeid);
Sort (cost=24683.44..24693.07 rows=3849 width=252) (actual time=60454.558..60474.013 rows=62403 loops=1)
   Sort Key: executiontimestamp
   Sort Method:  external merge  Disk: 17480kB
-> Bitmap Heap Scan on ruddersysevents (cost=450.12..24454.23 rows=3849 width=252) (actual time=146.065..60249.143 rows=62403 loops=1) Recheck Cond: ((ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text)) Filter: ((executiontimestamp >= to_timestamp('2012-11-22 16:00:16.005'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text)) AND (executiontimestamp <= to_timestamp('2013-01-25 18:53:52.467'::text, 'YYYY-MM-DD HH24:MI:SS.MS'::text))) -> Bitmap Index Scan on composite3_idx (cost=0.00..449.15 rows=6635 width=0) (actual time=129.102..129.102 rows=62403 loops=1) Index Cond: ((ruleid = '61713ff1-aa6f-4c86-b3cb-7012bee707dd'::text) AND (serial = 10) AND (nodeid = '31264061-5ecb-4891-9aa4-83824178f43d'::text))
 Total runtime: 60484.022 ms


And with all these tests, the difference of the whole database processing with each index time is within 10% error margin (2h10 to 2h25), even if most of the time is spent on the SQL query.

So my question is :
"Why *not* indexing the column which is not used makes the query slower over time, while not slowing the application?"

If you have some clues, I'm really interested.

Best regards
Nicolas



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux