On Mon, Jun 17, 2013 at 12:18 PM, Lonni J Friedman <netllama@xxxxxxxxx> wrote:
On Sat, Jun 15, 2013 at 8:03 PM, Bruce Momjian <bruce@xxxxxxxxxx> wrote:
> On Fri, Jun 14, 2013 at 02:29:24PM -0700, Lonni J Friedman wrote:
>> Greetings,
>> I'm in the early stages of preparing to upgrade a production 9.2
>> cluster to 9.3, by testing the beta of 9.3. All of my testing is
>> happening on RHEL6-x86_64 on a dedicated server with 128GB RAM and 2x
>> Intel Xeon E5-2670 CPUs, with all of $PGDATA residing on an 8 disk
>> RAID10 array.
>>
>> Currently, a full pg_basebackup of my data is approaching 800GB in
>> size (uncompressed), so this isn't a tiny, trivial database.
>>
>> I was curious about how much of a performance gain I'd get from
>> upgrading with the new -j option to pg_upgrade, so first I performed
>> the upgrade without it to get a baseline. The command I ran for the
>> upgrade is as follows:
>> time pg_upgrade -v -d /var/lib/pgsql/9.2/data -D
>> /var/lib/pgsql/9.3/data -b /usr/pgsql-9.2/bin -B /usr/pgsql-9.3/bin
>>
>> time reported the following after the upgrade had completed successfully:
>> real 24m59.255s
>> user 0m17.069s
>> sys 15m25.153s
>>
>>
>> I then repeated the upgrade (after blowing away $PGDATA, and running
>> initdb again for 9.3), and re-ran pg_upgrade with the same command as
>> above, only with '-j4' appended to the end. Surprisingly, the
>> completion time was less than 30 seconds faster. I repeated a third
>> time with '-j8', and that was about the same completion time as with
>> '-j4'. I guess I could repeat with '-j2', but I'd be surprised if it
>> was dramatically faster when -j4 was only marginally so. It seems
>> like the parallelism of the -j option isn't helping much at all, in
>> my case.
>>
>> Is this expected, or is it possible that there's a bug somewhere? Let
>> me know if I can provide any logs from the upgrade.
>
> The documentation states:
>
> The <option>--jobs</> option allows multiple CPU cores to be used
> for copying/linking of files and to dump and reload database schemas
> in parallel; a good place to start is the maximum of the number of
> CPU cores and tablespaces. This option can dramatically reduce the
> time to upgrade a multi-database server running on a multiprocessor
> machine.
>
> My guess is that you didn't have many tablespaces or databases, or the
> copy time overwhelmed the performance improvement of the parallelism.
> I am not surprised you didn't see a big win. Can you test --link
> mode?

I only have 1 tablespace, although I have 9 databases. However, one
of the databases is about 95% of the total on-disk space, so that's
probably the explanation of why -j isn't helping me?

That is probably why you DID see some improvement, however small.

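A rough back-of-the-envelope sketch of why the gain is capped, using the numbers from this thread. It assumes (my assumption, not something measured) that run time is proportional to on-disk size and that work on different databases can overlap perfectly, so the largest database sets the floor. Since you have a single tablespace, the real gain may well be smaller than this bound.

```shell
#!/bin/sh
# Baseline from the thread: real 24m59s without -j.
baseline_s=$((24 * 60 + 59))

# Share of on-disk space held by the largest database (from the thread).
largest_pct=95

# Assumption: time scales with size, and the other 8 databases overlap
# perfectly with the big one, so the big one alone bounds the run time.
best_case_s=$((baseline_s * largest_pct / 100))
saved_s=$((baseline_s - best_case_s))

printf 'best case: %dm%02ds (saves about %ds)\n' \
    $((best_case_s / 60)) $((best_case_s % 60)) "$saved_s"
```

Under those assumptions the best case is still close to 24 minutes, which is in line with the small improvement you measured.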
I don't have sufficient disk space to efficiently test --link mode,
unless there's some way to quickly roll back to the pre-upgrade
version of the database after a --link mode upgrade has completed
successfully that I'm not seeing?

I didn't get it. AFAIK, link mode takes **less** space than normal (copy) mode, not more. Am I wrong?
Unless you want to keep a backup of the old cluster on the same machine, but even in that case you could compress the backup, although the overall time would be worse.

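For reference, a sketch of what a link-mode dry run could look like, reusing the paths from your original command. The -j value is just an example, and the script only prints the command so it can be reviewed first. Note that --link uses hard links, so both data directories have to be on the same file system.

```shell
#!/bin/sh
# Paths copied from the command quoted earlier in this thread.
OLD_DATA=/var/lib/pgsql/9.2/data
NEW_DATA=/var/lib/pgsql/9.3/data
OLD_BIN=/usr/pgsql-9.2/bin
NEW_BIN=/usr/pgsql-9.3/bin

# --link hard-links the data files instead of copying them;
# --check only reports on compatibility and changes nothing.
CMD="pg_upgrade --check --link -j 4 -d $OLD_DATA -D $NEW_DATA -b $OLD_BIN -B $NEW_BIN"

# Print the command for review; run it by hand once it looks right.
echo "$CMD"
```

Removing --check from the printed command would perform the actual upgrade.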
Regards,
Matheus de Oliveira
Database Analyst
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres