David Yeu <david.yeu@xxxxxxxxx> wrote: > We have indices against the primary key and the group_id. > Our queries essentially fall into the following cases: > > * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20; > * Š WHERE group_id = ? AND id > ? ORDER BY created_at DESC; > * Š WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT > 20; > * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET > ?; > > In human words, we're looking for: > > * The most recent (20) rows. > * The most recent rows after a given `id'. > * Twenty rows before a given `id'. > * Pages of twenty rows. The first thing I would try is building an index (perhaps CONCURRENTLY to avoid disrupting production) on (group_id, created_at). It might also be worth creating an index on (group_id, id, created_at), but that's a less-sure win. > Originally, this table was part of our primary database, but > recently we saw queries take upwards of thirty seconds or more to > complete. Since we're serving web requests, that's basically > unacceptable, and caused a lot of requests to backup. With only the indexes you mention, it had to be doing either complete table scans for each request, or a lot of random access to rows it didn't need. > Our interim solution has been to simply carve out a new database > that hosts only this table, and that has worked to some degree. We > aren't seeing thirty seconds plus database response times anymore, > but some queries still take many seconds and the cost of spinning > up a new master-slave configuration hasn't been cheap. Well, throwing hardware at something doesn't generally hurt, but it's not the first solution I would try, especially when the product you're using has ways to tune performance. > In the meantime, we're hoping to investigate other ways to > optimize this table and the queries against it. Heroku's data team > has suggested balling up these rows into arrays, where a single > row would represent a group_id, and the data would occupy a single > column as an array. Ugh. You're a long way from needing to give up on the relational model here. > And finally, we're also trying out alternative stores, since it > seems like this data and its retrieval could be well suited to > document-oriented backends. Redis and DynamoDB are currently the > best contenders. Your current use of PostgreSQL is more or less equivalent to driving a car around in first gear. You might consider a tuned PostgreSQL as another alternative store. :-) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance