Re: Vacuum not cleaning up rows.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I took a look at the oldest query, xact & backend times in pg_stat_activity, but did not see any old dates, as of now (2019-06-20 13:54:12.041426-04) nothing is even an hour old as a start value:




=# select state, backend_start, xact_start, query_start from pg_stat_activity where datname = 'problem_database';
 state  |         backend_start         |          xact_start           |          query_start
--------+-------------------------------+-------------------------------+-------------------------------
 idle   | 2019-06-20 13:28:02.342849-04 |                               | 2019-06-20 13:41:30.561416-04
 idle   | 2019-06-20 13:40:33.578012-04 |                               | 2019-06-20 13:40:33.861842-04
 idle   | 2019-06-20 13:33:06.638612-04 |                               | 2019-06-20 13:41:30.56762-04
 idle   | 2019-06-20 13:38:06.549275-04 |                               | 2019-06-20 13:41:30.59876-04
 idle   | 2019-06-20 13:28:39.431864-04 |                               | 2019-06-20 13:41:30.462939-04
 idle   | 2019-06-20 13:38:02.583636-04 |                               | 2019-06-20 13:41:30.078732-04
 active | 2019-06-20 13:39:09.761125-04 | 2019-06-20 13:39:10.058196-04 | 2019-06-20 13:39:10.058196-04
 idle   | 2019-06-20 13:41:23.021731-04 |                               | 2019-06-20 13:41:23.289443-04
 idle   | 2019-06-20 13:28:10.023462-04 |                               | 2019-06-20 13:41:30.563567-04
 active | 2019-06-20 13:23:21.697794-04 | 2019-06-20 13:23:44.26898-04  | 2019-06-20 13:23:44.26898-04
 active | 2019-06-20 13:41:01.806997-04 | 2019-06-20 13:41:02.203739-04 | 2019-06-20 13:41:02.203739-04
 active | 2019-06-20 13:35:38.363437-04 | 2019-06-20 13:41:30.642168-04 | 2019-06-20 13:41:30.642168-04
(12 rows)





If I run a VACUUM VERBOSE it still claims it cannot remove 1,805,636 row versions:

=# vacuum verbose problem_table;
INFO:  vacuuming "problem_table"
INFO:  index "problem_table_pk" now contains 1792227 row versions in 17224 pages
DETAIL:  0 index row versions were removed.
3 index pages have been deleted, 0 are currently reusable.
CPU 0.03s/0.02u sec elapsed 0.05 sec.
INFO:  index "problem_table_1_idx" now contains 1792227 row versions in 17254 pages
DETAIL:  0 index row versions were removed.
4 index pages have been deleted, 0 are currently reusable.
CPU 0.02s/0.02u sec elapsed 0.04 sec.
INFO:  index "problem_table_2_idx" now contains 1792228 row versions in 4943 pages
DETAIL:  0 index row versions were removed.
1 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "problem_table_3_idx" now contains 1792228 row versions in 9164 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.02 sec.
INFO:  "problem_table": found 0 removable, 1805636 nonremovable row versions in 34770 out of 34770 pages
DETAIL:  1803634 dead row versions cannot be removed yet.
There were 697 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.17s/0.26u sec elapsed 0.43 sec.
VACUUM




I also checked the current tx id and pulled a list of (a) pg_stat_activity rows including backend_xid and backend_xmin and (b) a select from the table in question including the oldest xmin values:

=#  select txid_current();
 txid_current
--------------
    141325566
(1 row)




=# select state, backend_start, xact_start, query_start, backend_xid, backend_xmin from pg_stat_activity where datname = 'problem_database';
 state  |         backend_start         |          xact_start           |          query_start          | backend_xid | backend_xmin
--------+-------------------------------+-------------------------------+-------------------------------+-------------+--------------
 idle   | 2019-06-20 13:28:02.342849-04 |                               | 2019-06-20 13:41:44.71051-04  |             |
 idle   | 2019-06-20 13:40:33.578012-04 |                               | 2019-06-20 13:40:33.861842-04 |             |
 idle   | 2019-06-20 13:33:06.638612-04 |                               | 2019-06-20 13:41:44.701796-04 |             |
 idle   | 2019-06-20 13:38:06.549275-04 |                               | 2019-06-20 13:41:44.652429-04 |             |
 idle   | 2019-06-20 13:28:39.431864-04 |                               | 2019-06-20 13:41:44.234558-04 |             |
 idle   | 2019-06-20 13:38:02.583636-04 |                               | 2019-06-20 13:41:42.000154-04 |             |
 idle   | 2019-06-20 13:41:23.021731-04 |                               | 2019-06-20 13:41:23.289443-04 |             |
 idle   | 2019-06-20 13:28:10.023462-04 |                               | 2019-06-20 13:41:41.832205-04 |             |
 active | 2019-06-20 13:23:21.697794-04 | 2019-06-20 13:23:44.26898-04  | 2019-06-20 13:23:44.26898-04  |             |    141309498
 active | 2019-06-20 13:41:01.806997-04 | 2019-06-20 13:41:02.203739-04 | 2019-06-20 13:41:02.203739-04 |             |    141324650
 active | 2019-06-20 13:35:38.363437-04 | 2019-06-20 13:41:44.908629-04 | 2019-06-20 13:41:44.908629-04 |             |    141325259
(11 rows)




t=# select xmin::text, xmax, primary_id from problem_table order by 1;
   xmin    |   xmax    |              primary_id
-----------+-----------+--------------------------------------
 141306153 |         0 | 7aae6212-854b-428c-9a87-d07a178387dc
 141306169 |         0 | 8b9abcd7-a683-4ae3-af1b-e51fa373b836
 141306174 |         0 | f63943c0-3119-4b0b-96e8-d19d04fe48dc
 141306204 |         0 | 95e81769-5ad4-4285-b9dc-a7e2360f1fa2
 141306205 |         0 | 0c6fb4c1-4f17-490c-9e7e-ec7f90edd094
 141306224 | 141306224 | 05c833b6-efc9-4faf-8b3a-760b03d43abc
 141306245 |         0 | 9b0959de-2b3b-474a-ab72-4664f7c1d850
 141306254 |         0 | 0cedd9ef-de2a-4d44-bd28-50587a16174a
 141306264 |         0 | 70145477-26bc-4e2b-b51e-5533bcce0658
 141306265 |         0 | 1b312489-96df-4b32-9d4e-4f872dca1f86


I also ran a select * from pg_prepared_xacts which returned no rows



Thanks in advance for any additional help / direction...




On 6/19/19 5:50 PM, Tom Lane wrote:
"S. Bob" <sbob@xxxxxxxxxxxxxxxxxxxxx> writes:
we have a table that consistently has 2x the number of dead rows than 
live rows.
I've done some digging and it looks like somehow the application is 
holding shared locks that force vacuum to not cleanup rows.
This is not a locking issue; if it were, vacuum would be unable to
scan the table at all.

What it is is a snapshot issue: that is, some transaction has a snapshot
that's old enough to allow it to see those dead rows if it were to look.
So vacuum can't remove the rows for fear of causing transactional
consistency failures.

What you want to do is look into pg_stat_activity for transactions with
very old start times, and then cancel them or otherwise force them to
terminate.  Also, if there are no obvious candidates there, you might
be having a problem with prepared transactions --- look into
pg_prepared_xacts to see if there are any old ones of those.

			regards, tom lane

[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux