Pacemaker dynamic membership

Nikolay Popov <n.popov@xxxxxxxxxxxxxx> · Wed, 7 Oct 2015 10:44:10 +0300



    Hello.

      
      We was looking the ways to utilize Corosync/Pacemaker stack for
      creating a 
    high-availability

      cluster of PostgreSQL servers with automatic failover.
    

    We are using
      Corosync (2.3.4) as a messaging layer and a stateful master/slave 
    Resource Agent
      (pgsql) with Pacemaker (1.1.12) on CentOS 7.1.

      
    Things work
      pretty well for a static cluster - where membership is defined up
      front. 
    However, we
      needed to be able to seamlessly add new machines (node) to the
      cluster and remove 
    existing ones
      from it, without service interruption. And we ran into a problem.

      
      Is it possible to add a new node dynamically without interruption?

      
      Here are the steps we are using to add a node:

      
      # pcs property show

          Cluster Properties:

           cluster-infrastructure: corosync

           cluster-name: mycluster1

           dc-version: 1.1.13-a14efad

           have-watchdog: false

           last-lrm-refresh: 1444042099

           no-quorum-policy: stop

           stonith-action: reboot

           stonith-enabled: true

          Node Attributes:

           pi01: pgsql-data-status=STREAMING|SYNC

           pi02: pgsql-data-status=STREAMING|POTENTIAL

           pi03: pgsql-data-status=LATEST

          
      # pcs resource show --full
       Group: master-group

          
        Resource: vip-master
            (class=ocf provider=heartbeat type=IPaddr2)
         Attributes:
            ip=192.168.242.100 nic=eth0 cidr_netmask=24
         Operations: start
            interval=0s timeout=60s on-fail=restart
            (vip-master-start-interval-0s)
                     monitor
            interval=10s timeout=60s on-fail=restart
            (vip-master-monitor-interval-10s)
                     stop
            interval=0s timeout=60s on-fail=block
            (vip-master-stop-interval-0s)
        Resource: vip-rep
            (class=ocf provider=heartbeat type=IPaddr2)
         Attributes:
            ip=192.168.242.101 nic=eth0 cidr_netmask=24
         Meta Attrs:
            migration-threshold=0
         Operations: start
            interval=0s timeout=60s on-fail=stop
            (vip-rep-start-interval-0s)
                     monitor
            interval=10s timeout=60s on-fail=restart
            (vip-rep-monitor-interval-10s)
                     stop
            interval=0s timeout=60s on-fail=ignore
            (vip-rep-stop-interval-0s)
       Master: msPostgresql
        Meta Attrs: master-max=1
            master-node-max=1 clone-max=3 clone-node-max=1 notify=true
        Resource: pgsql (class=ocf
            provider=heartbeat type=pgsql)
         Attributes:
            pgctl=/usr/pgsql-9.5/bin/pg_ctl psql=/usr/pgsql-9.5/bin/psql
            pgdata=/var/lib/pgsql/9.5/data/ rep_mode=sync
            node_list="pi01 pi02 pi03" restore_command="cp
            /var/lib/pgsql/9.5/data/wal_archive/%f %p"
            primary_conninfo_opt="user=repl password=super-pass-for-repl
            keepalives_idle=60 keepalives_interval=5 keepalives_count=5"
            master_ip=192.168.242.100 restart_on_promote=true
            check_wal_receiver=true
         Operations: start
            interval=0s timeout=60s on-fail=restart
            (pgsql-start-interval-0s)
                     monitor
            interval=4s timeout=60s on-fail=restart
            (pgsql-monitor-interval-4s)
                     monitor
            role=Master timeout=60s on-fail=restart interval=3s
            (pgsql-monitor-interval-3s-role-Master)
                     promote
            interval=0s timeout=60s on-fail=restart
            (pgsql-promote-interval-0s)
                     demote
            interval=0s timeout=60s on-fail=stop
            (pgsql-demote-interval-0s)
                     stop
            interval=0s timeout=60s on-fail=block
            (pgsql-stop-interval-0s)
                     notify
            interval=0s timeout=60s (pgsql-notify-interval-0s)
      

          # pcs cluster auth pi01 pi02 pi03 pi05 -u hacluster -p
          hacluster
      pi01: Authorized
      pi02: Authorized
      pi03: Authorized
      pi05: Authorized
      

      # pcs cluster node add pi05
            --start
      pi01: Corosync updated
      pi02: Corosync updated
      pi03: Corosync updated
      pi05: Succeeded
      pi05: Starting Cluster...
    
    
      # crm_mon -Afr1              
                                                                    
      Last updated: Fri Oct  2
            16:59:54 2015          Last change: Fri Oct  2 16:59:23 2015
            by hacluster via crmd on pi02

          
      Stack: corosync
      Current DC: pi02 (version
            1.1.13-a14efad) - partition with quorum
      4 nodes and 8 resources
            configured
      

      Online: [ pi01 pi02 pi03 pi05
            ]
      

      Full list of resources:
      

       Resource Group: master-group
           vip-master
            (ocf::heartbeat:IPaddr2):       Started pi02
           vip-rep  
             (ocf::heartbeat:IPaddr2):       Started pi02
       Master/Slave Set:
            msPostgresql [pgsql]
           Masters: [ pi02 ]
           Slaves: [ pi01 pi03 ]
       fence-pi01    
            (stonith:fence_ssh):    Started pi02
       fence-pi02    
            (stonith:fence_ssh):    Started pi01
       fence-pi03    
            (stonith:fence_ssh):    Started pi01
      

      Node Attributes:
      * Node pi01:
          + master-pgsql          
                       : 100
          + pgsql-data-status      
                      : STREAMING|SYNC
          + pgsql-receiver-status  
                      : normal
          + pgsql-status          
                       : HS:sync
      * Node pi02:
          + master-pgsql          
                       : 1000
          + pgsql-data-status      
                      : LATEST
          + pgsql-master-baseline  
                      : 0000000008000098
          + pgsql-receiver-status  
                      : ERROR
          + pgsql-status          
                       : PRI
      * Node pi03:
          + master-pgsql          
                       : -INFINITY
          + pgsql-data-status      
                      : STREAMING|POTENTIAL
          + pgsql-receiver-status  
                      : normal
          + pgsql-status          
                       : HS:potential
      * Node pi05:
      

      Migration Summary:
      * Node pi01:
      * Node pi03:
      * Node pi02:
      * Node pi05:
      

      # pcs resource update msPostgresql
          pgsql master-max=1 master-node-max=1 clone-max=4
          clone-node-max=1 notify=true
      

      # crm_mon -Afr1              
                                            
      Last updated: Fri Oct  2
            17:04:36 2015          Last change: Fri Oct  2 17:04:07 2015
            by root via

          
       cibadmin on pi01
      Stack: corosync
      Current DC: pi02 (version
            1.1.13-a14efad) - partition with quorum
      4 nodes and 9 resources
            configured
      

      Online: [ pi01 pi02 pi03 pi05
            ]
      

      Full list of resources:
      

       Resource Group: master-group
           vip-master
            (ocf::heartbeat:IPaddr2):       Started pi02
           vip-rep  
             (ocf::heartbeat:IPaddr2):       Started pi02
       Master/Slave Set:
            msPostgresql [pgsql]
           Masters: [ pi02 ]
           Slaves: [ pi01 pi03 ]
           Stopped: [ pi05 ]
       fence-pi01    
            (stonith:fence_ssh):    Started pi02
       fence-pi02    
            (stonith:fence_ssh):    Started pi01
       fence-pi03    
            (stonith:fence_ssh):    Started pi01
      

      Node Attributes:
      * Node pi01:
          + master-pgsql          
                       : 100
          + pgsql-data-status      
                      : STREAMING|SYNC
          + pgsql-receiver-status  
                      : normal
          + pgsql-status          
                       : HS:sync
      * Node pi02:
          + master-pgsql          
                       : 1000
          + pgsql-data-status      
                      : LATEST
          + pgsql-master-baseline  
                      : 0000000008000098
          + pgsql-receiver-status  
                      : ERROR
          + pgsql-status          
                       : PRI
      * Node pi03:
          + master-pgsql          
                       : -INFINITY
          + pgsql-data-status      
                      : STREAMING|POTENTIAL
          + pgsql-receiver-status  
                      : normal
          + pgsql-status          
                       : HS:potential
      * Node pi05:
          + master-pgsql          
                       : -INFINITY
          + pgsql-status          
                       : STOP
      

      Migration Summary:
      * Node pi01:
      * Node pi03:
      * Node pi02:
      * Node pi05:
         pgsql:
            migration-threshold=1 fail-count=1000000 last-failure='Fri
            Oct  2 17:04:13 2015'
      

      Failed Actions:
      * pgsql_start_0 on pi05
            'unknown error' (1): call=27, status=complete,
            exitreason='My data may be
       inconsistent. You have to
            remove /var/lib/pgsql/tmp/PGSQL.lock file to force start.',
          last-rc-change='Fri Oct
             2 17:04:10 2015', queued=0ms, exec=2553ms
      

      # pcs resource update pgsql pgsql
          node_list="pi01 pi02 pi03 pi05"
      

            And here we fall into the trouble pgsql-status is now
              STOP!!!!!!!!

            
            # crm_mon -Afr1  

          
      Last updated: Fri Oct  2
            17:07:05 2015          Last change: Fri Oct  2 17:06:37 2015
       by root via cibadmin on pi01
      Stack: corosync
      Current DC: pi02 (version
            1.1.13-a14efad) - partition with quorum
      4 nodes and 9 resources
            configured
      

      Online: [ pi01 pi02 pi03 pi05
            ]
      

      Full list of resources:
      

       Resource Group: master-group
           vip-master
            (ocf::heartbeat:IPaddr2):       Stopped
           vip-rep  
             (ocf::heartbeat:IPaddr2):       Stopped
       Master/Slave Set:
            msPostgresql [pgsql]
           Slaves: [ pi02 ]
           Stopped: [ pi01 pi03
            pi05 ]
       fence-pi01    
            (stonith:fence_ssh):    Started pi02
       fence-pi02    
            (stonith:fence_ssh):    Started pi01
       fence-pi03    
            (stonith:fence_ssh):    Started pi01
      

      Node Attributes:
      * Node pi01:
          + master-pgsql          
                       : -INFINITY
          + pgsql-data-status      
                      : STREAMING|SYNC
          + pgsql-status          
                       : STOP
      * Node pi02:
          + master-pgsql          
                       : -INFINITY
          + pgsql-data-status      
                      : LATEST
          + pgsql-status          
                       : STOP
      * Node pi03:
          + master-pgsql          
                       : -INFINITY
          + pgsql-data-status      
                      : STREAMING|POTENTIAL
          + pgsql-status          
                       : STOP
      * Node pi05:
          + master-pgsql          
                       : -INFINITY
          + pgsql-status          
                       : STOP
      

      Migration Summary:
      * Node pi01:
      * Node pi03:
      * Node pi02:
      * Node pi05:
    
    
      Do you know the way to add new node to cluster without this
      disruption? Maybe some command or something else?

    
    -- 
Nikolay Popov
n.popov@xxxxxxxxxxxxxx
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company