On Tue, Jun 25, 2019 at 11:49:03AM -0400, Hugh Ranalli wrote:
> I'm hoping people can help me figure out where to look to solve an odd
> PostgreSQL performance problem.

What kernel? Version? OS?

If Linux, I wonder if transparent hugepages or KSM are enabled? It seems
possible that truncating the table is clearing enough RAM to mitigate the
issue, similar to restarting the DB.

tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag

https://www.postgresql.org/message-id/20170718180152.GE17566%40telsasoft.com

11.2 would have parallel query, enabled by default. Are there other
settings you've changed (or not changed)?
https://wiki.postgresql.org/wiki/Server_Configuration

It's possible that the "administrative" queries are using up lots of your
shared_buffers, which are (also/more) needed by the customer-facing
queries. I would install pg_buffercache to investigate. Or, just pause
the admin queries and see if the issue goes away during that interval?

SELECT 1.0*COUNT(1)/sum(count(1))OVER(),
       COUNT(1),
       COUNT(nullif(isdirty,'f')),
       datname,
       COALESCE(c.relname, b.relfilenode::text),
       d.relname AS toast,
       1.0*COUNT(nullif(isdirty,'f'))/count(1) AS dirtyfrac,
       avg(usagecount)
FROM pg_buffercache b
JOIN pg_database db ON b.reldatabase = db.oid
LEFT JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
LEFT JOIN pg_class d ON c.oid = d.reltoastrelid
GROUP BY 4,5,6
ORDER BY 1 DESC
LIMIT 9;

Could you send the query plan for the slow (customer-facing) queries?
https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN

> A bit of background: We have a client with a database of approximately
> 450 GB, which has a couple of tables storing large amounts of text,
> including full HTML pages from the Internet. Last fall, they began
> experiencing dramatic and exponentially decreasing performance.
> We track certain query times, so we know how much time is being spent
> in calls to the database for these functions. When this began, the
> times went from an average of approximately 200 ms to 400 ms, climbing
> rapidly each day before reaching 900 ms, figures we had never seen
> before, within 4 days, with no appreciable change in usage. It was at
> this point that we restarted the database server and times returned to
> the 400 ms range, but never back to their long-running original levels.
> From this point onward, we had to restart the database (originally the
> server, but eventually just the database process) every 3-4 days,
> otherwise the application became unusable.
>
> As they were still on PostgreSQL 8.2, we persuaded them to finally
> undertake our long-standing recommendation to upgrade, as there was no
> possibility of support on that platform. That upgrade to 11.2 was
> completed successfully in mid-May, and although times have not returned
> to their original levels (they now average approximately 250 ms), the
> application overall seems much more responsive and faster (application
> servers were not changed, other than minor changes --full text search,
> explicit casts, etc.-- to conform to PostgreSQL 11's requirements).
>
> What we continued to notice was a milder but still definite trend of
> increasing query times during the course of each week, from the mid to
> high 200 ms range to the high 300 ms to low 400 ms range. Some years
> ago, someone had noticed that as the number of "raw_page" columns in a
> particular table grew, performance would decline. They wrote a script
> that once a week locks the table, deletes the processed large columns
> (they are not needed after processing), copies the remaining data to a
> backup table, truncates the original table, then copies it back. When
> this script runs we see an immediate change in performance, from 380 ms
> in the hour before the drop, to 250 ms in the hour of the drop.
> As rows with these populated columns are added during the course of a
> week, the performance drops steadily until the next week's cleaning
> operation. Each week the performance increase is clear and significant.
>
> What is perplexing (and I have triple-checked) is that this table is
> *not* referenced in any way in the queries that we time (it is
> referenced by ongoing administrative and processing queries). The
> operation that cleans it frees up approximately 15-20 GB of space each
> week. Our system monitoring shows this change in free disk space, but
> this is 20 GB out of approximately 300 GB of free space (free space is
> just under 40% of the volume size), so disk space does not seem to be
> an issue. The table in question is about 21 GB in size, with about
> 20 GB in TOAST data, at its largest.
>
> Even odder, the queries we time *do* reference a much larger table,
> which contains very similar data, and multiple columns of it. It is
> 355 GB in size, with 318 GB in TOAST data. It grows continually, with
> no cleaning.
>
> If anyone has any suggestions as to what sort of statistics to look at,
> or why this would be happening, they would be greatly appreciated.
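PPS: re the query plans, what the wiki page is after is the output of
this form, with one of your actual timed, customer-facing queries in
place of the hypothetical placeholder:

```sql
-- Placeholder query; substitute a real customer-facing query.
-- BUFFERS distinguishes shared-buffer hits from disk reads, which is
-- the interesting part here.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ... ;
```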
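PS: to answer the hugepage/KSM question above in one go, here's a small
untested sketch (plain POSIX shell); the fallback is just so it doesn't
error out on kernels or builds that don't expose these files:

```shell
# Print each KSM/THP setting named above.  The files are absent on
# non-Linux systems or kernels built without these features, hence the
# "not present" fallback.
thp_report=""
for f in /sys/kernel/mm/ksm/run \
         /sys/kernel/mm/transparent_hugepage/khugepaged/defrag \
         /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
    val=$(cat "$f" 2>/dev/null) || val="not present"
    thp_report="$thp_report$f: $val
"
done
printf '%s' "$thp_report"
```

On the THP files, the bracketed word in the output (e.g. "[always]
madvise never") is the active setting.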