Search Postgresql Archives

Re: Scriptable way to validate a pg_dump restore ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 29, 2024 at 3:12 AM Laura Smith <n5d9xq3ti233xiyif2vp@xxxxxxxxxxxxx> wrote:
Hi

Let's say I've got a scenario where I'm doing a pg_dump replication rather than online streaming, e.g. due to air-gap or whatever.

Is there a scriptable way to validate the restore ?  e.g. using doing something clever with ctid or something to ensure both the schema and all its rows were restored to the same point at which the dump was taken ?

1. Since pg_dump creates a logical copy, nothing binary like ctid will work.
2. pg_dump is designed to take a snapshot.If it doesn't, it would fail for people who do logical replication.  However, no one has noticed. 

That's kinda like being asked to prove that rocks always fall when you drop them.  Either you trust physics, because physics has always worked, or you must watch every rock, because next time it might not fall.  The analogy is slightly flawed, since we always check the pg_dump and pg_restore return codes, since something else might impact their function.

But if you still need evidence, here's what I'm doing to verify table and record counts during a 9.6 -> 14 migration.  You'll have to modify it for your purpose.

Create this table and function beforehand:
CREATE TABLE dba.migration_table_counts (
    location text check (location in ('96', '14'))
  , table_name text
  , row_count bigint
  , count_time timestamp without time zone default current_timestamp
  , primary key (table_name, location)
);
CREATE OR REPLACE FUNCTION dba.get_table_counts(_p_source TEXT) RETURNS INTEGER
    LANGUAGE plpgsql
AS $func$
DECLARE
    r RECORD;
    _sql TEXT;
    _table_count BIGINT;
BEGIN
    FOR r IN select relnamespace::regnamespace::text||'.'||relname as table_name
            from pg_class cla
            where relkind = 'r'
            and not exists (select 1   -- excludes parent tables
                            from pg_inherits inh1
                            where inh1.inhparent = cla.oid)
            and relnamespace::regnamespace::text
                    not in ('pg_catalog', 'information_schema', 'dba')
            order by 1
    LOOP
        _sql := FORMAT('SELECT COUNT(*) FROM %s;', r.table_name);
        RAISE NOTICE '% %', to_char(clock_timestamp(), 'YYYY-MM-DD HH24:MI:SS.MS'), _sql;
        EXECUTE _sql INTO _table_count;
        --RAISE NOTICE '%', _table_count;
        INSERT INTO dba.migration_table_counts (location, table_name, row_count)
            VALUES (_p_source, r.table_name, _table_count);
    END LOOP;
    RAISE NOTICE '% %', to_char(clock_timestamp(), 'YYYY-MM-DD HH24:MI:SS.MS'), 'Finished';
    RETURN 0;
END
$func$;

Run this script in a cron job that executes at the same time as your pg_dump cron job.  Parameters should be for the source database. 
#!/bin/bash
declare -gr Server=$1
declare -gr DB=$2
declare -gr Source=$3
if [ -z $Server ]; then echo "Requires a server name."; exit 2; fi
if [ -z $DB ]; then echo "Requires a DB name."; exit 3; fi
if [ -z $Source ]; then echo "Requires a source: 96|14."; exit 4; fi
psql -U postgres -h ${Server} $DB -Xac "DELETE FROM dba.migration_table_counts WHERE location = '$Source';"
psql -U postgres -h ${Server} $DB -Xac "select * from dba.get_table_counts('$Source');"

Run the same script on the destination server after the pg_restore is finished.
Dump the source dba.migration_table_counts then load it into the destination dba.migration_table_counts.

These two queries run on the destination server will check that all tables exist in both databases, and that the record counts are the same.

You'll have some slight variations, since the two jobs are in separate transactions.  (Mine won't, since the applications will be shut down, and pg_hba.conf will block them.)

declare -gr SQL1="
    with
    c96 as (select table_name, row_count from dba.migration_table_counts where location = '96'),
    c14 as (select table_name, row_count from dba.migration_table_counts where location = '14')
    select c96.*, c14.*
    from c96 full join c14 on c96.table_name = c14.table_name
    where c96.table_name is null
       or c14.table_name is null
    order by c96.table_name, c14.table_name;"
psql $DB -ac "$SQL1"

declare -gr SQL2="
    with
    c96 as (select table_name, row_count from dba.migration_table_counts where location = '96'),
    c14 as (select table_name, row_count from dba.migration_table_counts where location = '14')
    select c96.*, c14.*, c96.row_count - c14.row_count as row_diff
    from c96 inner join c14 on c96.table_name = c14.table_name
    where c96.row_count != c14.row_count
    order by c96.table_name;"
psql $DB -ac "$SQL2"


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux