Actually, rsync works fine at the file level and is good for manual syncing.
It really checks the files with stat(), so even a small change will trigger
the copy. In practice you need to keep an eye on the completeness of the
rsync run. Try using it without compression for large data sets; strangely
enough, it saves time.

With kind regards,
Henk Bronk

On 20 Mar 2012, at 21:49, Bruce Momjian <bruce@xxxxxxxxxx> wrote:

> On Tue, Mar 20, 2012 at 02:58:20PM -0400, Bruce Momjian wrote:
>> On Tue, Mar 20, 2012 at 11:56:29AM -0700, Lonni J Friedman wrote:
>>>>> So how can you resume streaming without rebuilding the slaves?
>>>>
>>>> Oh, wow, I never thought of the fact that the system tables will be
>>>> different. I guess you could assume the pg_dump restore is going to
>>>> create things exactly the same on all the systems, but I never tested
>>>> that. Do the system IDs have to match? That would be a problem
>>>> because you are initdb'ing on each server. OK, crazy idea, but I
>>>> wonder if you could initdb on the master, then copy that to the
>>>> slaves, then run pg_upgrade on each of them. Obviously this needs
>>>> some testing.
>>>
>>> Wouldn't it be easier to just pg_upgrade the master, then set up the
>>> slaves from scratch (with rsync, etc.)? It certainly wouldn't be any
>>> more work to do it that way (although still a lot more work than
>>> simply running pg_upgrade on all servers).
>>
>> Hey, wow, that is an excellent idea, because rsync is going to realize
>> that all the user-data files are exactly the same and skip them --- that
>> is the winning solution. I should probably add this to the pg_upgrade
>> documentation. Thanks.
>
> Actually, I am not sure how well rsync will work, because by default it
> only skips files with matching file timestamp and size, and I bet many
> of the files will have different times because of streaming replication
> lag and server time lag. I think we need this rsync option:
>
>     -c, --checksum
>         This changes the way rsync checks if the files have been changed
>         and are in need of a transfer. Without this option, rsync uses a
>         "quick check" that (by default) checks if each file's size and
>         time of last modification match between the sender and receiver.
>         This option changes this to compare a 128-bit checksum for each
>         file that has a matching size. Generating the checksums means
>         that both sides will expend a lot of disk I/O reading all the
>         data in the files in the transfer (and this is prior to any
>         reading that will be done to transfer changed files), so this
>         can slow things down significantly.
>
>         The sending side generates its checksums while it is doing the
>         file-system scan that builds the list of the available files.
>         The receiver generates its checksums when it is scanning for
>         changed files, and will checksum any file that has the same size
>         as the corresponding sender's file: files with either a changed
>         size or a changed checksum are selected for transfer.
>
> and I suspect that will be slow. Probably better than nothing, but not
> super-fast either.
>
> --
>  Bruce Momjian  <bruce@xxxxxxxxxx>        http://momjian.us
>  EnterpriseDB                             http://enterprisedb.com
>
>  + It's impossible for everything to be true. +

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
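
A minimal sketch of the rsync step being discussed, for illustration only:
the data directory path and the host name "standby" are assumptions, and
--checksum is the option quoted from the man page above. Compression (-z)
is deliberately left off, per the advice at the top of this message.

    # After running pg_upgrade on the master, preview what rsync would
    # copy to the standby; --checksum forces content comparison, so
    # timestamp differences from replication lag do not trigger copies.
    rsync -a -c --delete -n -i /var/lib/pgsql/9.2/data/ \
        standby:/var/lib/pgsql/9.2/data/

    # If the itemized output looks sane, run it for real and print a
    # transfer summary, which helps with checking completeness.
    rsync -a -c --delete --stats /var/lib/pgsql/9.2/data/ \
        standby:/var/lib/pgsql/9.2/data/

The dry run with -i (--itemize-changes) is one way to address the "keep an
eye on completeness" point: it lists exactly which files would be
transferred before any data is actually moved.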