Table "public.syncerevent"
Column | Type | Modifiers
--------------+---------+----------------------------------------------------------
id | bigint | not null default nextval('syncerevent_id_seq'::regclass)
userid | text |
event | text |
eventid | text |
originatorid | text |
propogatorid | text |
kwargs | text |
conflicted | integer |
Indexes:
"syncerevent_pkey" PRIMARY KEY, btree (id)
"syncereventidindex" UNIQUE, btree (eventid)
"anothersyncereventidindex" btree (userid)
"anothersyncereventidindexwithascending" btree (userid, id)
"asdfasdgasdf" btree (userid, id DESC)
"syncereventuseridhashindex" hash (userid)
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..1218.57 rows=4000 width=615) (actual time=3352.390..3403.572 rows=4000 loops=1)
   Buffers: shared hit=120244 read=160198
   ->  Index Scan using syncerevent_pkey on syncerevent  (cost=0.43..388147.29 rows=1274560 width=615) (actual time=3352.386..3383.100 rows=4000 loops=1)
         Index Cond: (id > 12468)
         Filter: ((propogatorid <> '"d8130ab9!-66d0!-4f13!-acec!-a9556362f0ad"'::text) AND (conflicted <> 1) AND (userid = '57dc984f1c87461c0967e228'::text))
         Rows Removed by Filter: 1685801
         Buffers: shared hit=120244 read=160198
 Planning time: 0.833 ms
 Execution time: 3407.633 ms
(9 rows)
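The query itself is roughly of the following shape (the predicates, LIMIT, and ordering by id correspond to the plan above; SELECT * is just a stand-in for the actual column list):

    SELECT *
      FROM syncerevent
     WHERE id > 12468
       AND userid = '57dc984f1c87461c0967e228'
       AND propogatorid <> '"d8130ab9!-66d0!-4f13!-acec!-a9556362f0ad"'
       AND conflicted <> 1
     ORDER BY id
     LIMIT 4000;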
The Postgres version is: PostgreSQL 9.5.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit
This query has gotten slower over time.
The Postgres server is running on a db.m3.medium RDS instance on Amazon (3.75 GB of RAM, ~3 GHz single-core processor).
I ran VACUUM and ANALYZE on this table just prior to running the EXPLAIN command.
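That is, roughly:

    VACUUM syncerevent;
    ANALYZE syncerevent;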
Here are the server settings:
              name               |               current_setting               |       source
---------------------------------+---------------------------------------------+--------------------
 application_name                | psql                                        | client
 archive_command                 | /etc/rds/dbbin/pgscripts/rds_wal_archive %p | configuration file
 archive_mode                    | on                                          | configuration file
 archive_timeout                 | 5min                                        | configuration file
 autovacuum_analyze_scale_factor | 0.05                                        | configuration file
 autovacuum_naptime              | 30s                                         | configuration file
 autovacuum_vacuum_scale_factor  | 0.1                                         | configuration file
 checkpoint_completion_target    | 0.9                                         | configuration file
 client_encoding                 | UTF8                                        | client
 effective_cache_size            | 1818912kB                                   | configuration file
 fsync                           | on                                          | configuration file
 full_page_writes                | on                                          | configuration file
 hot_standby                     | off                                         | configuration file
 listen_addresses                | *                                           | command line
 lo_compat_privileges            | off                                         | configuration file
 log_checkpoints                 | on                                          | configuration file
 log_directory                   | /rdsdbdata/log/error
Sorry for the formatting; I'm not sure of the best way to present this data on a mailing list.
If it matters/interests you, here is my underlying confusion:
From some internet sleuthing, I've decided that having a table per user (which would totally make this problem a non-issue) isn't a great idea: because there is a file per table, having a table per user would not scale. My next thought was partial indexes (which would also help a great deal), but since there is likewise a file per index, that doesn't really sidestep the problem. My rough mental model says: if there exists a way that a table-per-user scheme would make this more efficient, then there should also exist an index that achieves the same effect (or close enough not to matter). I would have thought that the predicate userid = '57dc984f1c87461c0967e228' could use at least one of the indexes on the userid column, but clearly I'm not understanding something.
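For concreteness, the kind of per-user partial index I was imagining is something like this (hypothetical name; the catch being that one such index per user is exactly the part that doesn't scale):

    CREATE INDEX syncerevent_user_57dc984f_idx
        ON syncerevent (id)
        WHERE userid = '57dc984f1c87461c0967e228';

My assumption was that the existing (userid, id) btree should be able to play the same role for every user at once.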
Any help in making this query more efficient would be greatly appreciated, and any conceptual insights would be extra awesome.
Thanks for reading.
-Jake