Hi All,
I'm trying to understand why there's a difference between what pgstattuple reports and pg_stat_user_tables reports (for the number of dead tuples).
As I understand, pgstattuple and pgstattuple_approx return the exact number of dead tuples (as noted in the documentation) and based on an older Stack Overflow answer the value returned from pg_stat_user_tables "uses the most recent data collected by ANALYZE".
Why would it be that even after analyzing a table the n_dead_tup value is still vastly different than dead_tuple_count?
> SELECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovacuum_c
ount FROM pg_stat_user_tables WHERE relname = 'oban_jobs' and schemaname = 'oban')b;
-[ RECORD 1 ]-------------------------
dead_tuple_count | 3736
n_dead_tup | 1127044
last_autovacuum | 2024-08-23 16:00:30.983141+00
last_analyze | 2024-08-23 15:33:50.628422+00
now | 2024-08-23 16:01:19.915893+00
autovacuum_count | 446478
SELECT 1
> vacuum (verbose,analyze) oban.oban_jobs;
vacuuming "oban.oban_jobs"
table "oban_jobs": index scan bypassed: 29341 pages from table (0.79% of total) have 1111747 dead item identifiers
launched 2 parallel vacuum workers for index cleanup (planned: 2)
index "oban_jobs_args_index" now contains 18281 row versions in 10232 pages
0 index row versions were removed.
0 index pages were newly deleted.
56 index pages are currently deleted, of which 833 are currently reusable.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
index "oban_jobs_meta_index" now contains 18281 row versions in 9698 pages
0 index row versions were removed.
0 index pages were newly deleted.
35 index pages are currently deleted, of which 621 are currently reusable.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
table "oban_jobs": found 855 removable, 9661 nonremovable row versions in 29341 out of 3727204 pages
1330 dead row versions cannot be removed yet, oldest xmin: 1378705314
Skipped 0 pages due to buffer pins, 3696951 frozen pages.
912 skipped pages using mintxid fork.
CPU: user: 0.12 s, system: 0.08 s, elapsed: 0.22 s.
vacuuming "pg_toast.pg_toast_72454950"
table "pg_toast_72454950": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
0 dead row versions cannot be removed yet, oldest xmin: 1378705314
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 skipped pages using mintxid fork.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
analyzing "oban.oban_jobs"
"oban_jobs": scanned 30000 of 3727204 pages, containing 75 live rows and 10501 dead rows; 75 rows in sample, 9318 estimated total rows
VACUUM
> SELECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovacuum_c
ount FROM pg_stat_user_tables WHERE relname = 'oban_jobs' and schemaname = 'oban')b;
-[ RECORD 1 ]-------------------------
dead_tuple_count | 1701
n_dead_tup | 1306009
last_autovacuum | 2024-08-23 16:01:31.034229+00
last_analyze | 2024-08-23 16:01:47.85574+00
now | 2024-08-23 16:01:55.734589+00
autovacuum_count | 446479
This is a Google Alloy DB instance running:
> select version();
-[ RECORD 1 ]-------------------------
version | PostgreSQL 14.10 on x86_64-pc-linux-gnu, compiled by Debian clang version 12.0.1, 64-bit
SELECT 1
As I understand, pgstattuple and pgstattuple_approx return the exact number of dead tuples (as noted in the documentation) and based on an older Stack Overflow answer the value returned from pg_stat_user_tables "uses the most recent data collected by ANALYZE".
Why would it be that even after analyzing a table the n_dead_tup value is still vastly different than dead_tuple_count?
> SELECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovacuum_c
ount FROM pg_stat_user_tables WHERE relname = 'oban_jobs' and schemaname = 'oban')b;
-[ RECORD 1 ]-------------------------
dead_tuple_count | 3736
n_dead_tup | 1127044
last_autovacuum | 2024-08-23 16:00:30.983141+00
last_analyze | 2024-08-23 15:33:50.628422+00
now | 2024-08-23 16:01:19.915893+00
autovacuum_count | 446478
SELECT 1
> vacuum (verbose,analyze) oban.oban_jobs;
vacuuming "oban.oban_jobs"
table "oban_jobs": index scan bypassed: 29341 pages from table (0.79% of total) have 1111747 dead item identifiers
launched 2 parallel vacuum workers for index cleanup (planned: 2)
index "oban_jobs_args_index" now contains 18281 row versions in 10232 pages
0 index row versions were removed.
0 index pages were newly deleted.
56 index pages are currently deleted, of which 833 are currently reusable.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
index "oban_jobs_meta_index" now contains 18281 row versions in 9698 pages
0 index row versions were removed.
0 index pages were newly deleted.
35 index pages are currently deleted, of which 621 are currently reusable.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
table "oban_jobs": found 855 removable, 9661 nonremovable row versions in 29341 out of 3727204 pages
1330 dead row versions cannot be removed yet, oldest xmin: 1378705314
Skipped 0 pages due to buffer pins, 3696951 frozen pages.
912 skipped pages using mintxid fork.
CPU: user: 0.12 s, system: 0.08 s, elapsed: 0.22 s.
vacuuming "pg_toast.pg_toast_72454950"
table "pg_toast_72454950": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
0 dead row versions cannot be removed yet, oldest xmin: 1378705314
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 skipped pages using mintxid fork.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
analyzing "oban.oban_jobs"
"oban_jobs": scanned 30000 of 3727204 pages, containing 75 live rows and 10501 dead rows; 75 rows in sample, 9318 estimated total rows
VACUUM
> SELECT * FROM (SELECT dead_tuple_count from pgstattuple_approx('oban.oban_jobs'))a, (SELECT n_dead_tup,last_autovacuum,last_analyze,now(),autovacuum_c
ount FROM pg_stat_user_tables WHERE relname = 'oban_jobs' and schemaname = 'oban')b;
-[ RECORD 1 ]-------------------------
dead_tuple_count | 1701
n_dead_tup | 1306009
last_autovacuum | 2024-08-23 16:01:31.034229+00
last_analyze | 2024-08-23 16:01:47.85574+00
now | 2024-08-23 16:01:55.734589+00
autovacuum_count | 446479
This is a Google Alloy DB instance running:
> select version();
-[ RECORD 1 ]-------------------------
version | PostgreSQL 14.10 on x86_64-pc-linux-gnu, compiled by Debian clang version 12.0.1, 64-bit
SELECT 1