Search Postgresql Archives

Re: pg_upgrade and wraparound

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>>  From the query below I am going to say the above query was done on the bof database. Is that correct?

Yes, it is.

>>  Can you run the table_name query in template0 in the 9.6 cluster?

At first I couldn't. There was an error:

psql: FATAL:  database "template0" is not currently accepting connections

I've googled for it and found out that template0 is some special system database that is protected from connections, but that can be changed (see https://wiki.postgresql.org/wiki/Adventures_in_PostgreSQL,_Episode_1)
I've changed that and connected to this database and ran your query. The result is attached in 96-query1-template0.txt.

2018-06-11 22:10 GMT+03:00 Adrian Klaver <adrian.klaver@xxxxxxxxxxx>:
On 06/11/2018 11:32 AM, Alexander Shutyaev wrote:
I'm back with more details.

First, I've deleted the smaller sslentry database, since I don't need it, just so that it doesn't somehow spoil the picture. Now there is only 1 user database - bof (OID=16400). After that I've ran the pg_upgrade on a clean 10.4 cluster and it failed in the same way.

Now, the answers to your queries.

 >> 2) The upgrade stops because of transaction ID wraparound, which is strange as that is not showing up in the 9.6 cluster I presume. You might want the queries found below on the 9.6 and 10 clusters to help figure this out:

 >> SELECT c.oid::regclass as table_name,
 >>        greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
 >> FROM pg_class c
 >> LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
 >> WHERE c.relkind IN ('r', 'm');

9.6 result - see attached 96-query1.txt
10.4 result - see attached 104-query1.txt

I should have been clearer in my previous post, the above query is per database. From the query below I am going to say the above query was done on the bof database. Is that correct?

Given the below from 96-query2.txt:

template0 | 110588398

Can you run the table_name query in template0 in the 9.6 cluster?



 >> SELECT datname, age(datfrozenxid) FROM pg_database;

9.6 result - see attached 96-query2.txt
10.4 result - see attached 104-query2.txt

 >>  It might be useful
 >> to see pg_controldata output for the old cluster, as well as
 >> "select datname, datfrozenxid from pg_database" output from the
 >> old cluster.

for the query - see above, for pg_controldata:

9.6 - see attached 96-pg_controldata.txt
10.4 - see attached 104-pg_controldata.txt

 >> Alexander, could you hack things up so autovacuum logging is enabled
 >> (log_autovacuum_min_duration=0), and see whether it's triggered?

I'll be happy to, but that will require to run pg_upgrade once more and that takes more that half a day and during this time clusters are not available to me. Given the data I'm attaching it may happen that the colleagues will want to see something else from my clusters or maybe change some settings before running the pg_upgrade again. Therefore, I'll wait 12 hours after this message in case there will be any more requests and the I'll run the pg_upgrade again.

Thank you all for trying to solve this matter, this is much appreciated! :)

2018-06-11 20:29 GMT+03:00 Andres Freund <andres@xxxxxxxxxxx <mailto:andres@xxxxxxxxxxx>>:

    On 2018-06-11 13:14:12 -0400, Tom Lane wrote:
    > Andres Freund <andres@xxxxxxxxxxx <mailto:andres@xxxxxxxxxxx>> writes:
    > > I suspect the issue is that pg_resetwal does:
    > >     if (set_xid != 0)
    > >     {
    > >             ControlFile.checkPointCopy.nextXid = set_xid;
    >     > >             /*
    > >              * For the moment, just set oldestXid to a value that will force
    > >              * immediate autovacuum-for-wraparound.  It's not clear whether adding
    > >              * user control of this is useful, so let's just do something that's
    > >              * reasonably safe.  The magic constant here corresponds to the
    > >              * maximum allowed value of autovacuum_freeze_max_age.
    > >              */
    > >             ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000;
    > >             if (ControlFile.checkPointCopy.oldestXid < FirstNormalTransactionId)
    > >                     ControlFile.checkPointCopy.oldestXid += FirstNormalTransactionId;
    > >             ControlFile.checkPointCopy.oldestXidDB = InvalidOid;
    > >     }
    >     > > but we have codepath that doesn't check for oldestXidDB being
    > > InvalidOid.  Not great.
    >     > Hm, I think I'd define the problem as "pg_resetwal is violating the
    > expectation that oldestXidDB be valid".

    Well, what could it do otherwise?  ForceTransactionIdLimitUpdate()
    currently does a syscache check for database existence. That'll just
    return a lookup failure for InvalidOid, so we're reasonably good on that
    front.

    Using a hardcoded 2000000000 seems worse, will have funny results if
    running with a smaller autovacuum_freeze_max_age...


    > However, this just explains the basically-cosmetic issue that the
    > complaint message mentions OID 0.  It doesn't really get us to the
    > answer to why Alexander is seeing a failure.  It might be useful
    > to see pg_controldata output for the old cluster, as well as
    > "select datname, datfrozenxid from pg_database" output from the
    > old cluster.

    pg_upgrade starts the server with autovacuum disabled, I suspect
    restoring all the large objects ends up using a lot of transaction
    ids. GetNewTransactionId() should start autovacuum, but I'd guess that's
    where things are going wrong for some reason.

    Alexander, could you hack things up so autovacuum logging is enabled
    (log_autovacuum_min_duration=0), and see whether it's triggered?

    I'm not entirely clear why pg_restore appears to use a separate
    transaction for each large object, surely exascerbating the problem.

    Greetings,

    Andres Freund




--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx

                 table_name                 |    age
--------------------------------------------+-----------
 pg_statistic                               | 110588399
 pg_type                                    | 110588399
 pg_authid                                  | 110588399
 pg_replication_origin                      | 110588399
 pg_default_acl                             | 110588399
 pg_init_privs                              | 110588399
 pg_seclabel                                | 110588399
 pg_shseclabel                              | 110588399
 pg_collation                               | 110588399
 pg_range                                   | 110588399
 pg_transform                               | 110588399
 information_schema.sql_features            | 110588399
 information_schema.sql_languages           | 110588399
 information_schema.sql_packages            | 110588399
 pg_amop                                    | 110588399
 pg_amproc                                  | 110588399
 pg_language                                | 110588399
 pg_largeobject_metadata                    | 110588399
 pg_aggregate                               | 110588399
 pg_rewrite                                 | 110588399
 pg_largeobject                             | 110588399
 pg_trigger                                 | 110588399
 pg_event_trigger                           | 110588399
 pg_description                             | 110588399
 pg_cast                                    | 110588399
 pg_enum                                    | 110588399
 pg_namespace                               | 110588399
 pg_conversion                              | 110588399
 pg_depend                                  | 110588399
 pg_database                                | 110588399
 information_schema.sql_implementation_info | 110588399
 pg_db_role_setting                         | 110588399
 pg_tablespace                              | 110588399
 pg_pltemplate                              | 110588399
 pg_auth_members                            | 110588399
 pg_shdepend                                |  33255025
 pg_shdescription                           | 110588399
 pg_ts_config                               | 110588399
 pg_ts_config_map                           | 110588399
 pg_ts_dict                                 | 110588399
 pg_ts_parser                               | 110588399
 pg_ts_template                             | 110588399
 pg_extension                               | 110588399
 pg_foreign_data_wrapper                    | 110588399
 pg_foreign_server                          | 110588399
 pg_foreign_table                           | 110588399
 pg_policy                                  | 110588399
 information_schema.sql_parts               | 110588399
 information_schema.sql_sizing              | 110588399
 information_schema.sql_sizing_profiles     | 110588399
 pg_user_mapping                            | 110588399
 pg_attribute                               | 110588399
 pg_proc                                    | 110588399
 pg_class                                   | 110588399
 pg_attrdef                                 | 110588399
 pg_constraint                              | 110588399
 pg_inherits                                | 110588399
 pg_index                                   | 110588399
 pg_operator                                | 110588399
 pg_opfamily                                | 110588399
 pg_opclass                                 | 110588399
 pg_am                                      | 110588399
(62 rows)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux