Re: Distributed transactions and asynchronous commit

Vasilis Ventirozos <v.ventirozos@xxxxxxxxx> · Wed, 17 Jul 2013 12:26:51 +0100

On Wed, Jul 17, 2013 at 12:21 PM, Vasilis Ventirozos <v.ventirozos@xxxxxxxxx> wrote:

On Wed, Jul 17, 2013 at 11:52 AM, Xenofon Papadopoulos <xpapad@xxxxxxxxx> wrote:

Thank you for your replies so far.The DB in question is Postgres+ 9.2 running inside a VM with the following specs:

16 CPUs (dedicated to the VM)
60G RAM
RAID-10 storage on a SAN for pgdata and pgarchieves, using different LUNs for each.

We have 3 kind of queries:

- The vast majority of the queries are small SELECT/INSERT/UPDATEs which are part of distributed transactions
- A few small ones, which are mostly SELECTs

- A few bulk loads, where we add 100k - 1M of rows in tables

Our settings are:

shared_buffers: 8G
work_mem: 12M
checkpoint_segments: 64

Autovacuum is somewhat aggressive, as our data changes quite often and without it the planner was completely off.
Right now we use:

 autovacuum_analyze_scale_factor: 0.1 

 autovacuum_analyze_threshold: 50   
 autovacuum_freeze_max_age: 200000000 
 autovacuum_max_workers: 12                 
 autovacuum_naptime: 10s                   
 autovacuum_vacuum_cost_delay: 20ms       

 autovacuum_vacuum_cost_limit: -1              
 autovacuum_vacuum_scale_factor: 0.2         
 autovacuum_vacuum_threshold: 50         

settings look ok, except vacuum and analyze threshold that is in my opinion too agressive (500 would make more sense) and workers at 6 you haven't mentioned wal_buffers and  effective_io_concurrency settings but i dont think that it would make much of a difference

At high-peak hour, the disk utilization for the pgdata mountpoint is:

00:00:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:20:01     dev253-2   7711.62  24166.97  56657.95     10.48    735.28     95.09      0.11     86.11

13:30:01     dev253-2   5340.88  19465.30  39133.32     10.97    319.20     59.94      0.15     82.30
13:40:01     dev253-2   2791.02  13061.76  19330.40     11.61    349.95    125.38      0.33     90.73

13:50:01     dev253-2   3478.69  10503.84  25505.27     10.35    308.12     88.57      0.20     68.12
14:00:01     dev253-2   5269.12  33613.43  35830.13     13.18    232.48     44.09      0.19    100.05

14:10:01     dev253-2   4910.24  21767.22  33970.96     11.35    322.52     65.64      0.21    104.55
14:20:02     dev253-2   5358.95  40772.03  33682.46     13.89    721.81    134.32      0.20    104.92

14:30:01     dev253-2   4420.51  17256.16  33315.27     11.44    336.53     76.13      0.15     65.25
14:40:02     dev253-2   4884.13  28439.26  31604.76     12.29    265.32     54.26      0.20     97.51

14:50:01     dev253-2   3124.91   8077.46  22511.59      9.79     50.41     16.13      0.24     76.17

assuming that sector = 512 bytes, it means that your san makes 20mb/sec read which if its not totally random-reads is quite low,

i would start from there, make tests to see if everything works ok, (bonnie++, dd , etc) and if you are getting the numbers you are supposed to

i would also check for index / table bloat, here's a script that it would do that for you

http://labs.omniti.com/labs/pgtreats/browser/trunk/tools/pg_bloat_report.pl

and for pgarchives:

00:00:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
13:20:01     dev253-3   2802.25      0.69  22417.32      8.00    465.05    165.94      0.02      4.32
13:30:01     dev253-3   1559.87  11159.45  12120.99     14.92     64.17     41.11      0.08     12.02

13:40:01     dev253-3    922.62   8066.62   7129.15     16.47     19.75     21.40      0.08      6.99
13:50:01     dev253-3   1194.81    895.34   9524.53      8.72     28.40     23.76      0.01      1.69

14:00:01     dev253-3   1919.12      0.46  15352.49      8.00     51.75     26.95      0.01      1.61
14:10:01     dev253-3   1770.59   9286.61  13873.79     13.08    139.86     78.97      0.08     14.46

14:20:02     dev253-3   1595.04  11810.63  12389.08     15.17    109.17     68.42      0.15     24.71
14:30:01     dev253-3   1793.71  12173.88  13957.79     14.57    141.56     78.89      0.08     13.61

14:40:02     dev253-3   1751.62      0.43  14012.53      8.00     43.38     24.76      0.01      1.40
14:50:01     dev253-3   1351.72   3225.19  10707.29     10.31     31.91     23.59      0.02      2.93

On Wed, Jul 17, 2013 at 1:09 PM, Giuseppe Broccolo <giuseppe.broccolo@xxxxxxxxxxxxxx> wrote:

    Hi,

      Il 17/07/2013 09:18, Xenofon Papadopoulos ha scritto:

        In
              the asynchronous commit documentation, it says:

        The
            commands supporting two-phase commit, such as PREPARE
            TRANSACTION,
            are also always synchronous

        Does
              this mean that all queries that are part of a distributed
              transaction are synchronous?

        In
              our databases we have extremely high disk I/O, I'm
              wondering if distributed transactions may be the reason
              behind it.

    Distributed transactions are base on two-phase-commit (2PC)
    algorithms for ensuring correct transaction completion,  so are
    synchronous. 

    However, I think this is not the main reason behind your extremely
    high disk I/O. You should check if your system is properly tuned to
    get the best performances. 

    First of all, you could take a look on your PostgreSQL
    configurations, and check if shared_memory is set properly taking
    into account your RAM availability. The conservative PostgreSQL
    default value is 24 MB, forcing system to exploit many disk I/O
    resources.

    Aside from this, you could take a look if autovacuum is often
    triggered (generating a large amount of I/O) in case of large use of
    updates/inserts in your database.

    Regards,

    Giuseppe.

    -- 
Giuseppe Broccolo - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
giuseppe.broccolo@xxxxxxxxxxxxxx | www.2ndQuadrant.it