Clustering service PostgreSQL

ESGLinux <esggrupos@xxxxxxxxx> · Wed, 15 Jul 2009 09:35:16 +0200

Hi all, 
I'm trying to cluster a PostgreSQL database in active-passive mode with two nodes.

I have tested it: if I fence one node, the service fails over to the other node. However, when I run the command to manually relocate the service:

clusvcadm -r BBDD -m NODO1 

while there are open connections to the database, I get these error messages in /var/log/messages:

Jul 15 09:27:29 NODO2 clurgmgrd[2493]: <notice> Stopping service service:BBDD
Jul 15 09:27:33 NODO2 clurgmgrd: [2493]: <err> Stopping Service postgres-8:BBDD > Failed - Application Is Still Running
Jul 15 09:27:33 NODO2 clurgmgrd: [2493]: <err> Stopping Service postgres-8:BBDD > Failed
Jul 15 09:27:33 NODO2 clurgmgrd[2493]: <notice> stop on postgres-8 "BBDD" returned 1 (generic error)
Jul 15 09:27:33 NODO2 clurgmgrd: [2493]: <err> Stopping Service postgres-8:BBDD > Failed
Jul 15 09:27:33 NODO2 clurgmgrd[2493]: <notice> stop on postgres-8 "BBDD" returned 1 (generic error)
Jul 15 09:27:33 NODO2 avahi-daemon[2304]: Withdrawing address record for 192.168.1.183 on eth0.
Jul 15 09:27:43 NODO2 clurgmgrd[2493]: <crit> #12: RG service:BBDD failed to stop; intervention required
Jul 15 09:27:43 NODO2 clurgmgrd[2493]: <notice> Service service:BBDD is failed

Jul 15 09:27:43 NODO2 clurgmgrd[2493]: <warning> #70: Failed to relocate service:BBDD; restarting locally
Jul 15 09:27:43 NODO2 clurgmgrd[2493]: <err> #43: Service service:BBDD has failed; can not start.
Jul 15 09:27:44 NODO2 clurgmgrd[2493]: <alert> #2: Service service:BBDD returned failure code.  Last Owner: NODO2.
Jul 15 09:27:44 NODO2 clurgmgrd[2493]: <alert> #4: Administrator intervention required.

If I check the status of the service: 
Service Name                                                     Owner (Last)                                                     State
 ------- ----                                                     ----- ------                                                     -----
 service:BBDD                                                     (NODO2)                                           failed

And if I check with ps, the postmaster and one idle client connection are still running:

# ps aux | grep postgres
root     21552  0.0  0.2   2844  1120 ?        S<   09:27   0:00 su - postgres -c /usr/bin/postmaster -c config_file="/etc/cluster/postgres-8/postgres-8:BBDD/postgresql.conf" -D /nfsvol/pgsql/data
postgres 21553  0.1  0.5  21504  3076 ?        S<s  09:27   0:00 /usr/bin/postmaster -c config_file=/etc/cluster/postgres-8/postgres-8:BBDD/postgresql.conf -D /nfsvol/pgsql/data
postgres 21591  0.0  0.1  11284   608 ?        S<   09:27   0:00 postgres: logger process                                                                    
postgres 21593  0.0  0.1  21504   896 ?        S<   09:27   0:00 postgres: writer process                                                                    
postgres 21594  0.0  0.1  12284   608 ?        S<   09:27   0:00 postgres: stats buffer process                                                              
postgres 21595  0.0  0.1  11428   804 ?        S<   09:27   0:00 postgres: stats collector process                                                           
postgres 21720  0.0  0.8  22280  4328 ?        S<   09:27   0:00 postgres: postgres postgres 192.168.1.170(2849) idle                                        

So I think the problem is in the stop script that ships with the cluster suite: it should either close all open connections or wait until they finish.
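
To illustrate what I mean, here is a minimal sketch of the stop behaviour I would expect. I have not checked what the postgres-8 resource agent actually does on stop, so this is only an assumption about where the difference lies. PostgreSQL's "smart" shutdown waits for all clients to disconnect, while "fast" shutdown disconnects them and rolls back open transactions. The function name stop_postgres and the DRY_RUN variable are hypothetical; the data directory is the one from the ps output above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the cluster suite.
# Must be run as the postgres user when DRY_RUN is not set.
stop_postgres() {
    # Shutdown mode: smart, fast or immediate (defaults to fast here).
    mode="${1:-fast}"
    # Data directory taken from the ps output; override with PGDATA.
    datadir="${PGDATA:-/nfsvol/pgsql/data}"
    # With DRY_RUN set, only print the command instead of running it.
    ${DRY_RUN:+echo} pg_ctl stop -D "$datadir" -m "$mode"
}
```

Calling stop_postgres fast with DRY_RUN=1 only prints the pg_ctl command, which makes it easy to check before trying it on a live node.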

Has anyone else had this problem, and how can it be solved?

Thanks in advance

ESG

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster