On Tue, Mar 21, 2017 at 10:56 PM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
> On Tue, Mar 21, 2017 at 4:24 PM, James Parks <james.parks@xxxxxxxxxx> wrote:
>> ... and here's how long it takes to read all of the rows:
>> database=> select max(an unindexed bigint column) from a;
>> -[ RECORD 1 ]--------
>> max | <some number>
>> Time: 10624.368 ms
>>
>> Running this another time immediately afterward (to show the cached speed)
>> returns:
>> Time: 13782.363 ms
>>
>> If I go to a separate database cluster that has an equivalent schema, and
>> roughly equivalent table a (+- 2% on the number of rows), the above queries
>> look more like this:
>>
>> meraki_shard_production=> vacuum verbose a;
>> INFO: vacuuming "public.a"
>> INFO: index "a_pkey" now contains 42171 row versions in 162 pages
>> DETAIL: 0 index row versions were removed.
>> 0 index pages have been deleted, 0 are currently reusable.
>> CPU 0.00s/0.00u sec elapsed 0.00 sec.
>> INFO: "a": found 487 removable, 42286 nonremovable row versions in 7809 out
>> of 7853 pages
>> DETAIL: 373 dead row versions cannot be removed yet.
>> There were 42436 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 0.00s/0.02u sec elapsed 0.01 sec.
>> INFO: vacuuming "pg_toast.pg_toast_19037"
>> INFO: index "pg_toast_19037_index" now contains 57 row versions in 2 pages
>> DETAIL: 0 index row versions were removed.
>> 0 index pages have been deleted, 0 are currently reusable.
>> CPU 0.00s/0.00u sec elapsed 0.00 sec.
>> INFO: "pg_toast_19037": found 0 removable, 57 nonremovable row versions in
>> 12 out of 12 pages
>> DETAIL: 0 dead row versions cannot be removed yet.
>> There were 0 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 0.00s/0.00u sec elapsed 0.00 sec.
>> VACUUM
>> Time: 32.890 ms
>>
>> database=> select max(the same unindexed bigint column) from a;
>>       max
>> -----------------
>>  <some number>
>> (1 row)
>> Time: 16.696 ms
>> (The second iteration takes 15.320 ms)
>>
>> So, the way I see it, my problem boils down to table "A" getting roughly
>> 100-1000x slower when it gets roughly 20-50x bigger (depending if you
>> measure in pages or tuples). Unfortunately, in my use case, table "A" acts
>> as a join table for a lot of aspects of our company's webapp. Every 10
>> minutes, the table is queried for 35 million rows via sequential scan (~800
>> seq scans per minute, ~1.3 per second on average), and 6.5 million rows via
>> index lookup. When a sequential scan over 40k rows takes less than 1 second,
>> everything is fine -- when it takes 10+ seconds the database starts to slow
>> down significantly. Thankfully, queries can share sequential scans, but you
>> can imagine how the responsiveness of the webapp might suffer as a
>> consequence. There's also the secondary effect that, should the query on B
>> complete, there now exist many queries against A (and other related tables)
>> that are slow enough to potentially increase the size of A even further. It
>> is not uncommon for queries involving A to start taking upwards of 30
>> minutes to complete, when they usually complete in roughly 300ms, after some
>> maintenance query against B has completed.
>>
>> Our go-to solution has been to detect and stop these maintenance queries if
>> they take too long, and then to CLUSTER table A. This puts a cap on how long
>> any maintenance query can take -- down to somewhere around 1 hour.
>>
>> And thus my query to you guys:
>>
>> What can I do to keep running long maintenance operations on large tables
>> (SELECTing significant fractions of B, DELETEing significant fractions of B,
>> running VACUUM FULL on B) without denying other Postgresql backends their
>> ability to efficiently query table A? Or, in other words, how do I avoid
>> incurring the cost of transaction isolation for queries against B on a
>> case-by-case basis?
>>
>> Anything is on the table for implementation:
>> - moving tables to a different database / cluster / completely different
>>   DBMS system
>> - designing an extension to tune either sets of queries
>> - partitioning tables
>> - etc
>> ... although the simpler the better. If you were in this position, what
>> would you do?
>>
>> Regards,
>> James
>
> You're experiencing bloat because the transaction on B is preventing
> the xid horizon from moving forward, thus dead tuples from A cannot be
> reclaimed in case the transaction on B decides to query them.
>
> There's only one "easy" solution for this as far as I know, and it is
> to run your long-running queries on a hot standby. That certainly
> works for most read-only workloads, especially pg_dump.

Forgot to clarify... for your use case, make sure you *don't* enable
standby feedback on the standby.
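
For reference, the setting in question is hot_standby_feedback in the
standby's postgresql.conf; it defaults to off, so it only needs to be left
that way. A minimal sketch, assuming a streaming-replication standby
dedicated to the long maintenance queries:

    # postgresql.conf on the standby
    # Leave feedback off so long-running queries here cannot hold back
    # vacuum (and thus cause bloat) on the primary.
    hot_standby_feedback = off
    # Optionally give long queries on the standby more slack before they
    # get cancelled by replayed cleanup records from the primary:
    max_standby_streaming_delay = 30min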
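
And to confirm the diagnosis on the primary, a rough check (assuming
PostgreSQL 9.4 or later, where pg_stat_activity exposes backend_xmin) is to
look at which sessions are holding back the xmin horizon while the
maintenance query on B runs:

    -- Sessions ordered by how far back they pin the xmin horizon; the
    -- long-running query on B should show up with a large xmin_age,
    -- which is what keeps vacuum from reclaiming dead tuples in A.
    SELECT pid,
           state,
           age(backend_xmin) AS xmin_age,
           now() - xact_start AS xact_runtime,
           left(query, 60) AS query
    FROM pg_stat_activity
    WHERE backend_xmin IS NOT NULL
    ORDER BY age(backend_xmin) DESC;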