On Sat, 2016-12-03 at 13:08 -0500, Kiriakos Georgiou wrote:
> The array_agg() has nothing to do with it. It’s the group by.
> Without knowing what you are conceptually trying to accomplish, I
> can’t say much.

It *IS* caused by array_agg(). PostgreSQL can only do HashAggregate
when the whole hash table fits into work_mem, and in this case it has
to deal with aggregate states of unknown size, so it assumes each state
is 1kB IIRC. Per the plan, the GROUP BY is expected to produce ~27k
groups, so the HashAggregate would need about 30MB - presumably more
than work_mem, which is why the planner falls back to a sort-based
GroupAggregate. (A sketch of testing this by raising work_mem follows
at the end of this message.)

> On my test 9.4.10 db, a similar example does a HashAggregate, so no
> sorting (google HashAggregate vs GroupAggregate). But still it’s an
> expensive query because of all the I/O.

The query does almost no I/O, actually. The bitmap heap scan takes only
~230ms, which is not bad considering it produces ~1M rows. The expensive
part here seems to be the sort, but I doubt that's due to I/O - it only
produces temporary files that likely stay in RAM anyway. So the sort is
probably slow because of CPU, as it compares strings. In some locales
that may be very expensive - not sure which locale is used in this case,
as it was not mentioned. (A collation sketch also follows below.)

> If I wanted to instantly have the user ids for a specific first, last
> name and category combo, I’d maintain a summary table via an insert
> trigger on the users table.

Maybe. The question is whether it'll be a net win - maintenance of the
summary table will not be free, especially with arrays of ids. (A rough
trigger sketch follows below.)

regards

--
Tomas Vondra                      http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
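A minimal sketch of the work_mem point above. The users table and its
columns are invented here, since the original query was not quoted; the
idea is just that raising work_mem above the planner's ~30MB estimate
may let it pick HashAggregate instead of Sort + GroupAggregate:

    -- hypothetical schema: users(first_name, last_name, category, user_id)
    -- raise the per-node memory limit for this session only
    SET work_mem = '64MB';

    -- if the estimated hash table now fits into work_mem, the plan
    -- should switch from GroupAggregate (with a Sort below it) to
    -- HashAggregate, eliminating the expensive sort
    EXPLAIN ANALYZE
    SELECT first_name, last_name, category, array_agg(user_id)
      FROM users
     GROUP BY first_name, last_name, category;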
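To check whether the sort is paying for expensive locale-aware string
comparisons, one can look at the database collation and try grouping
under the "C" collation, which compares raw bytes (memcmp) and is
typically much cheaper than e.g. en_US.UTF-8. The groups come out the
same; only the comparison cost changes. Again, the table and column
names are hypothetical:

    -- which collation governs string comparisons in this database?
    SHOW lc_collate;

    -- force byte-wise comparison just for this query
    EXPLAIN ANALYZE
    SELECT last_name COLLATE "C", first_name COLLATE "C", array_agg(user_id)
      FROM users
     GROUP BY last_name COLLATE "C", first_name COLLATE "C";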
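And a rough sketch of the summary-table approach from the quoted text,
to make the maintenance cost concrete. All names are invented, and the
upsert needs PostgreSQL 9.5+ for ON CONFLICT. Note that every insert
into users now rewrites a whole (potentially large) array row in the
summary table - which is exactly the "not free" part:

    CREATE TABLE user_summary (
        first_name text  NOT NULL,
        last_name  text  NOT NULL,
        category   text  NOT NULL,
        user_ids   int[] NOT NULL,
        PRIMARY KEY (first_name, last_name, category)
    );

    CREATE FUNCTION user_summary_upsert() RETURNS trigger AS $$
    BEGIN
        -- append the new user id to the matching group, creating the
        -- group on first sight
        INSERT INTO user_summary (first_name, last_name, category, user_ids)
        VALUES (NEW.first_name, NEW.last_name, NEW.category, ARRAY[NEW.user_id])
        ON CONFLICT (first_name, last_name, category)
        DO UPDATE SET user_ids = user_summary.user_ids || EXCLUDED.user_ids;
        RETURN NULL;  -- AFTER trigger, return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER user_summary_trg
        AFTER INSERT ON users
        FOR EACH ROW EXECUTE PROCEDURE user_summary_upsert();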