On Wed, Jun 26, 2019 at 03:00:43PM -0400, Hugh Ranalli wrote:
> Pausing the admin queries isn't an option in our environment, especially as
> the issue reveals itself over the course of days, not minutes or hours.

Perhaps you can pause it for a short while at EOW and see if there's a
dramatic improvement?

>         ?column?        |  count  | count |  datname  |        coalesce         |     toast      |         dirtyfrac          |        avg
> ------------------------+---------+-------+-----------+-------------------------+----------------+----------------------------+--------------------
>  0.24904101286779650995 | 1044545 |     0 | mydb      | position                |                | 0.000000000000000000000000 | 4.8035517857057379
>  0.16701241622795295199 |  700495 |     0 | mydb      | stat_position_click     |                | 0.000000000000000000000000 | 1.9870234619804567
>  0.09935032779251879171 |  416702 |  6964 | mydb      | pg_toast_19788          | harvested_job  |     0.01671218280689797505 | 1.9346079452462431
>  0.06979762146872315533 |  292750 |     0 | mydb      | url                     |                | 0.000000000000000000000000 | 4.9627873612297182
>  0.03795774662998486745 |  159205 |     0 | mydb      | stat_sponsored_position |                | 0.000000000000000000000000 | 1.8412361420809648
>  0.02923155381784048663 |  122605 |     0 | mydb      | pg_toast_20174          | page           | 0.000000000000000000000000 | 3.0259532645487541
>  0.02755283459406156353 |  115564 |     0 | mydb      | location                |                | 0.000000000000000000000000 | 4.9953532241874632
>  0.02015273698468076320 |   84526 |  1122 | mydb      | harvested_job           |                |     0.01327402219435439983 | 4.9922154130090150
>  0.01913348905375406298 |   80251 |     0 | mydb      | pg_toast_20257          | position_index | 0.000000000000000000000000 | 4.9880001495308470
>
> harvested_job is the rapidly growing "problematic" table I am talking
> about. page is the 355 GB table that gets referenced on the public
> searches. I'll google, but is there a place I should look to understand
> what I am seeing here?

I should label the columns:
| buffer_fraction | nbuffers | ndirty | datname | relname | toast | dirtyfrac | avgusage

It looks like harvested_job is possibly being index scanned, and its toast
table is using up many buffers.  By the EOW, maybe those buffers come at the
expense of more important data.  You could check pg_stat_user_tables/indexes
for stats on that.  Possibly you could make use of index-only scans using
covering indexes (pg11 supports INCLUDE).  Or maybe it's just too big (maybe
it should be partitioned, or maybe the index should be repacked?).

> Also, should pg_buffercache perhaps be run at the beginning and end of the
> week, to see if there is a significant difference?

Yes; buffercache can be pretty volatile, so I'd save it numerous times each
at the beginning and end of the week.

> > Could you send the query plan for the slow (customer-facing) queries?
> >
> > https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN
>
> I can, but can I ask why this would matter?

My very tentative guess is that harvested_job itself isn't the issue, but
that some other, third thing is, one which also grows (at least roughly)
with time, just as that table does.  It'd help to see the buffer cache hit
rate for that query (and its different query plan nodes) at the beginning
and end of the week.

Justin
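P.S. For reference, a query along these lines produces that kind of
buffercache summary (this is a sketch, not necessarily exactly what you ran;
it needs the pg_buffercache extension and should be run from within mydb,
since relations from other databases won't resolve to names):

SELECT count(*)::numeric /
         (SELECT setting::int FROM pg_settings
           WHERE name = 'shared_buffers') AS buffer_fraction,
       count(*) AS nbuffers,
       count(*) FILTER (WHERE b.isdirty) AS ndirty,
       d.datname,
       coalesce(c.relname, b.relfilenode::text) AS relname,
       t.relname AS toast,   -- parent table, if the relation is a toast table
       count(*) FILTER (WHERE b.isdirty)::numeric / count(*) AS dirtyfrac,
       avg(b.usagecount) AS avgusage
FROM pg_buffercache b
LEFT JOIN pg_database d ON d.oid = b.reldatabase
LEFT JOIN pg_class c ON c.relfilenode = b.relfilenode
LEFT JOIN pg_class t ON t.reltoastrelid = c.oid
GROUP BY 4, 5, 6
ORDER BY 1 DESC
LIMIT 9;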
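To see how harvested_job and its indexes are actually being accessed
(run in mydb; the counters are cumulative since the last stats reset):

SELECT seq_scan, seq_tup_read, idx_scan, idx_tup_fetch,
       n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'harvested_job';

SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'harvested_job';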
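For the index-only scan idea, a covering index on pg11 looks like this; the
column names here are made up, so substitute whatever your queries actually
filter on and return:

CREATE INDEX CONCURRENTLY harvested_job_covering_idx
    ON harvested_job (posted_date) INCLUDE (title, url);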
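An easy way to save buffercache numerous times is to dump it into a table
with a timestamp (the table name is just an example):

CREATE TABLE buffercache_snapshot AS
SELECT now() AS snap_time, * FROM pg_buffercache LIMIT 0;

-- then from cron/psql, every hour or so at the beginning and end of the week:
INSERT INTO buffercache_snapshot
SELECT now(), * FROM pg_buffercache;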
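And when you collect the customer-facing plans, the form from the wiki page
above is the one that shows shared hit/read counts per plan node:

EXPLAIN (ANALYZE, BUFFERS) <your slow customer-facing query here>;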