Hello,
you are right it looks to be related, on our running system with connected users such an issue happens not only on primary keys but also on other (non-unique) indexes.
I've checked all indexes using amcheck:
select * from bt_index_check(index => 'prematch.opportunities.pk_tabodds_idodds'::regclass::oid, heapallindexed => true);
Failed one rebuild and check again to ensure all things are OK.
Still have a problem and haven't found any explanation in the manuals:
Run amcheck to validate a table and its indexes:
/usr/pgsql-14/bin/pg_amcheck -d prematch -r opportunities.tab_odds --heapallindexed -j 16 -Pv ; echo $?
...
btree index "prematch.opportunities.pk_tabodds_idodds":
ERROR: heap tuple (2199116,5) from table "tab_odds" lacks matching index tuple within index "pk_tabodds_idodds"
HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error.
2
ERROR: heap tuple (2199116,5) from table "tab_odds" lacks matching index tuple within index "pk_tabodds_idodds"
HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error.
2
OK, rebuild the index (first give it a try to use the concurrently option):
prematch=# REINDEX INDEX CONCURRENTLY prematch.opportunities.pk_tabodds_idodds;
REINDEX
prematch=#
prematch=# select * from bt_index_check(index => 'prematch.opportunities.pk_tabodds_idodds'::regclass::oid, heapallindexed => true);
bt_index_check
----------------
(1 row)
REINDEX
prematch=#
prematch=# select * from bt_index_check(index => 'prematch.opportunities.pk_tabodds_idodds'::regclass::oid, heapallindexed => true);
bt_index_check
----------------
(1 row)
Looks we were lucky.
Just to be sure, one more check from the command line:
/usr/pgsql-14/bin/pg_amcheck -d prematch -r opportunities.tab_odds --heapallindexed -j 16 -P ; echo $?
0/15 relations (0%), 0/16807950 pages (0%)
heap table "prematch.opportunities.tab_odds", block 1649057, offset 47:
xmin 4062380236 precedes relation freeze threshold 2:3960858664
heap table "prematch.opportunities.tab_odds", block 1649057, offset 48:
xmax 4062380236 precedes relation freeze threshold 2:3960858664
heap table "prematch.opportunities.tab_odds", block 1649057, offset 47:
xmin 4062380236 precedes relation freeze threshold 2:3960858664
heap table "prematch.opportunities.tab_odds", block 1649057, offset 48:
xmax 4062380236 precedes relation freeze threshold 2:3960858664
...
xmax 4062380813 precedes relation freeze threshold 2:3960858664
heap table "prematch.opportunities.tab_odds", block 2210728, offset 8:
xmax 4062380814 precedes relation freeze threshold 2:3960858664
2
heap table "prematch.opportunities.tab_odds", block 2210728, offset 8:
xmax 4062380814 precedes relation freeze threshold 2:3960858664
2
So I got non-zero exit code, but no errors in the log file:
-bash-4.2$ grep -i ERROR 20220525_1710_prematch.opportunities.tab_odds.log
-bash-4.2$
-bash-4.2$
We have multiple DB servers upgraded to 14, only a few amchecks:
( time /usr/pgsql-14/bin/pg_amcheck -a --heapallindexed -j 16 -P ; echo $? ) |& tee -a `date +%Y%m%d_%H%M`_`hostname`_amcheck.log
finished with zero exit code, most of them returned 2, within the xmin/xmax messages as shown above. Is that an issue (since missing entries in the index cause ERRRO reported in the log, but after rebuilding the indexes it was not the case anymore)?
If it is an issue, is there a way to fix it?
Thanks Ales
út 24. 5. 2022 v 21:58 odesílatel Thomas Munro <thomas.munro@xxxxxxxxx> napsal:
On Wed, May 25, 2022 at 6:17 AM Aleš Zelený <zeleny.ales@xxxxxxxxx> wrote:
> SELECT format('REINDEX SCHEMA CONCURRENTLY %I;', n.nspname)
This may be related to bug #17485, discussed at:
https://www.postgresql.org/message-id/flat/17485-396609c6925b982d%40postgresql.org