This is the cluster.conf, which is a clone of the problematic system in a test environment (without the Oracle and SAP instances, focusing only on this LVM issue, with an LVM resource):

[root@rhel2 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="7" name="teszt">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="rhel1.local" nodeid="1" votes="1">
            <fence/>
        </clusternode>
        <clusternode name="rhel2.local" nodeid="2" votes="1">
            <fence/>
        </clusternode>
    </clusternodes>
    <cman expected_votes="3"/>
    <fencedevices/>
    <rm>
        <failoverdomains>
            <failoverdomain name="all" nofailback="1" ordered="1" restricted="0">
                <failoverdomainnode name="rhel1.local" priority="1"/>
                <failoverdomainnode name="rhel2.local" priority="2"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <lvm lv_name="teszt-lv" name="teszt-lv" vg_name="teszt"/>
            <fs device="/dev/teszt/teszt-lv" fsid="43679" fstype="ext4" mountpoint="/lvm" name="teszt-fs"/>
        </resources>
        <service autostart="1" domain="all" exclusive="0" name="teszt" recovery="disable">
            <lvm ref="teszt-lv"/>
            <fs ref="teszt-fs"/>
        </service>
    </rm>
    <quorumd label="qdisk"/>
</cluster>

Here are the relevant log entries:

Aug 10 17:21:21 rgmanager I am node #2
Aug 10 17:21:22 rgmanager Resource Group Manager Starting
Aug 10 17:21:22 rgmanager Loading Service Data
Aug 10 17:21:29 rgmanager Initializing Services
Aug 10 17:21:31 rgmanager /dev/dm-2 is not mounted
Aug 10 17:21:31 rgmanager Services Initialized
Aug 10 17:21:31 rgmanager State change: Local UP
Aug 10 17:21:31 rgmanager State change: rhel1.local UP
Aug 10 17:23:23 rgmanager Starting stopped service service:teszt
Aug 10 17:23:25 rgmanager Failed to activate logical volume, teszt/teszt-lv
Aug 10 17:23:25 rgmanager Attempting cleanup of teszt/teszt-lv
Aug 10 17:23:29 rgmanager Failed second attempt to activate teszt/teszt-lv
Aug 10 17:23:29 rgmanager start on lvm "teszt-lv" returned 1 (generic error)
Aug 10 17:23:29 rgmanager #68: Failed to start service:teszt; return value: 1
Aug 10 17:23:29 rgmanager Stopping service service:teszt
Aug 10 17:23:30 rgmanager stop: Could not match /dev/teszt/teszt-lv with a real device
Aug 10 17:23:30 rgmanager stop on fs "teszt-fs" returned 2 (invalid argument(s))
Aug 10 17:23:31 rgmanager #12: RG service:teszt failed to stop; intervention required
Aug 10 17:23:31 rgmanager Service service:teszt is failed
Aug 10 17:24:09 rgmanager #43: Service service:teszt has failed; can not start.
Aug 10 17:24:09 rgmanager #13: Service service:teszt failed to stop cleanly
Aug 10 17:25:12 rgmanager Starting stopped service service:teszt
Aug 10 17:25:14 rgmanager Failed to activate logical volume, teszt/teszt-lv
Aug 10 17:25:15 rgmanager Attempting cleanup of teszt/teszt-lv
Aug 10 17:25:17 rgmanager Failed second attempt to activate teszt/teszt-lv
Aug 10 17:25:18 rgmanager start on lvm "teszt-lv" returned 1 (generic error)
Aug 10 17:25:18 rgmanager #68: Failed to start service:teszt; return value: 1
Aug 10 17:25:18 rgmanager Stopping service service:teszt
Aug 10 17:25:19 rgmanager stop: Could not match /dev/teszt/teszt-lv with a real device
Aug 10 17:25:19 rgmanager stop on fs "teszt-fs" returned 2 (invalid argument(s))

After I manually activated the LV on node1 and then tried to relocate the service to node2, node2 was not able to start it. (See the command sketches after the quoted thread below.)

Regards,
Krisztian

On 08/10/2012 05:15 PM, Digimer wrote:
> On 08/10/2012 11:07 AM, Poós Krisztián wrote:
>> Dear all,
>>
>> I hope someone has run into this problem in the past and can help me
>> resolve it.
>>
>> It is a 2-node RHEL cluster with a quorum disk as well.
>> There are clustered LVM volume groups, with the 'c' flag set.
>> If I start clvmd, all the clustered LVs come online.
>>
>> After this, if I start rgmanager, it deactivates all the volumes and is
>> not able to activate them again; the devices no longer exist during
>> startup of the service, so the service fails. All LVs remain without
>> the active flag.
>>
>> I can bring the service up manually, but only if, after clvmd has
>> started, I deactivate the LVs by hand with lvchange -an <lv>.
>> After that, when I start rgmanager, it can bring the service online
>> without problems. However, I think this step should be done by
>> rgmanager itself. The logs are full of the following:
>> rgmanager Making resilient: lvchange -an ....
>> rgmanager lv_exec_resilient failed
>> rgmanager lv_activate_resilient stop failed on ....
>>
>> Also, the lvs/clvmd commands sometimes hang, and I have to restart
>> clvmd (sometimes kill it) to make them work again.
>>
>> Does anyone have an idea what to check?
>>
>> Thanks and regards,
>> Krisztian
>
> Please paste your cluster.conf file with minimal edits.
>
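A minimal sketch of the manual checks mentioned above, using the VG/LV names from the test config (teszt/teszt-lv). The lvchange -aey step is an assumption about what the HA-LVM resource attempts on a clustered VG, not taken from the agent source:

[root@rhel1 ~]# vgs -o vg_name,vg_attr teszt           # 6th attr character 'c' = clustered VG
[root@rhel1 ~]# lvs -o lv_name,vg_name,lv_attr teszt   # 5th attr character 'a' = active
[root@rhel1 ~]# lvchange -aey teszt/teszt-lv           # try exclusive activation by hand
[root@rhel1 ~]# ls -l /dev/teszt/teszt-lv              # device node should exist after activation
[root@rhel1 ~]# dmsetup ls | grep teszt                # confirm the device-mapper entry

If the manual lvchange -aey fails in the same way as the rgmanager start, the problem lies with clvmd/LVM itself rather than with the resource agent.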
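The workaround described in the quoted message (deactivating the LVs after clvmd starts, before rgmanager takes over) would look roughly like the following; the service and LV names come from the test config above, and the clusvcadm calls are standard rgmanager administration rather than anything specific to this setup:

[root@rhel2 ~]# service clvmd start
[root@rhel2 ~]# lvchange -an teszt/teszt-lv                # deactivate so rgmanager can activate it itself
[root@rhel2 ~]# service rgmanager start
[root@rhel2 ~]# clusvcadm -d service:teszt                 # clear a failed state first, if needed
[root@rhel2 ~]# clusvcadm -e service:teszt                 # start the service
[root@rhel2 ~]# clusvcadm -r service:teszt -m rhel1.local  # relocate to the other node to test failover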