On Fri, Jul 5, 2013 at 2:42 AM, Ryan Mitchell wrote:
> You aren't starting rgmanager with the -N option are you? It is not the
> default.
> # man clurgmgrd
> -N Do not perform stop-before-start. Combined with the -Z
> flag to clusvcadm, this can be used to allow rgmanager to be upgraded
> without stopping a given user service or set of services.
>
> What is supposed to happen is:
> - clvmd is started at boot time, and all clustered logical volumes are
> activated (including CLVM HA-LVM volumes)
> - rgmanager starts after clvmd, and it initializes all resources to ensure
> they are in a known state. For example:
> Jul 4 20:06:26 r6ha1 rgmanager[2478]: I am node #1
> Jul 4 20:06:27 r6ha1 rgmanager[2478]: Resource Group Manager Starting
> Jul 4 20:06:27 r6ha1 rgmanager[2478]: Loading Service Data
> Jul 4 20:06:33 r6ha1 rgmanager[2478]: Initializing Services
> <----
> Jul 4 20:06:33 r6ha1 rgmanager[3316]: [fs] stop: Could not match
> /dev/vgdata/lvmirror with a real device
> Jul 4 20:06:33 r6ha1 rgmanager[2478]: stop on fs "fsdata" returned 2
> (invalid argument(s))
> Jul 4 20:06:35 r6ha1 rgmanager[2478]: Services Initialized
> Jul 4 20:06:35 r6ha1 rgmanager[2478]: State change: Local UP
> Jul 4 20:06:35 r6ha1 rgmanager[2478]: State change: r6ha2.cluster.net UP
> - So when rgmanager starts, it stops the CLVM HA-LVM logical volumes again
> prior to starting the service, unless you disabled the "stop-before-start"
> option.
>
> I did a quick test and I got the same results as you. Can you show your
> resource/service definitions and the logs of when rgmanager starts up?
>
>
> If you open a case with Red Hat, it may find its way to me and we can
> troubleshoot further.

Thanks for the answer Ryan.
I opened the case 00900301 as suggested.

I think the problem is with the clvmd already activating lvs.
My service is composed of an ip resource and some <lv..> and <fs...> resources.

When the nodes start up, on the node chosen by the priority definition of
the failover domain I get this:

Jul 4 14:27:46 oraugov4 rgmanager[6469]: Services Initialized
Jul 4 14:27:46 oraugov4 rgmanager[6469]: State change: Local UP
Jul 4 14:27:46 oraugov4 rgmanager[6469]: Starting stopped service service:MYSERVICE
Jul 4 14:27:48 oraugov4 rgmanager[9436]: [lvm] Failed to activate logical volume, VG_UGDMPRO_TEMP/LV_UGDMPRO_TEMP
Jul 4 14:27:48 oraugov4 rgmanager[9458]: [lvm] Attempting cleanup of VG_UGDMPRO_TEMP/LV_UGDMPRO_TEMP
Jul 4 14:27:49 oraugov4 rgmanager[9484]: [lvm] Failed second attempt to activate VG_UGDMPRO_TEMP/LV_UGDMPRO_TEMP
Jul 4 14:27:49 oraugov4 rgmanager[6469]: start on lvm "LV_UGDMPRO_TEMP" returned 1 (generic error)
Jul 4 14:27:49 oraugov4 rgmanager[6469]: #68: Failed to start service:MYSERVICE; return value: 1
Jul 4 14:27:49 oraugov4 rgmanager[6469]: Stopping service service:MYSERVICE
Jul 4 14:27:49 oraugov4 rgmanager[9557]: [fs] stop: Could not match /dev/VG_PROVA/lv_prova with a real device
Jul 4 14:27:49 oraugov4 rgmanager[6469]: stop on fs "PROVA" returned 2 (invalid argument(s))
Jul 4 14:27:49 oraugov4 rgmanager[9594]: [fs] stop: Could not match /dev/VG_UGDMPRE_RDOF/LV_UGDMPRE_RDOF with a real device
Jul 4 14:27:49 oraugov4 rgmanager[6469]: stop on fs "UGDMPRE_RDOF" returned 2 (invalid argument(s))
Jul 4 14:27:49 oraugov4 rgmanager[9631]: [fs] stop: Could not match /dev/VG_UGDMPRE_REDO/LV_UGDMPRE_REDO with a real device
Jul 4 14:27:49 oraugov4 rgmanager[6469]: stop on fs "UGDMPRE_REDO" returned 2 (invalid argument(s))
Jul 4 14:27:49 oraugov4 rgmanager[9669]: [fs] stop: Could not match /dev/VG_UGDMPRE_DATA/LV_UGDMPRE_DATA with a real device
Jul 4 14:27:49 oraugov4 rgmanager[6469]: stop on fs "UGDMPRE_DATA" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9706]: [fs] stop: Could not match /dev/VG_UGDMPRE_SAVE/LV_UGDMPRE_SAVE with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRE_SAVE" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9743]: [fs] stop: Could not match /dev/VG_UGDMPRE_CTRL/LV_UGDMPRE_CTRL with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRE_CTRL" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9780]: [fs] stop: Could not match /dev/VG_UGDMPRE_TEMP/LV_UGDMPRE_TEMP with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRE_TEMP" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9817]: [fs] stop: Could not match /dev/VG_UGDMPRO_RDOF/LV_UGDMPRO_RDOF with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRO_RDOF" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9854]: [fs] stop: Could not match /dev/VG_UGDMPRO_REDO/LV_UGDMPRO_REDO with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRO_REDO" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9891]: [fs] stop: Could not match /dev/VG_UGDMPRO_DATA/LV_UGDMPRO_DATA with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRO_DATA" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9928]: [fs] stop: Could not match /dev/VG_UGDMPRO_SAVE/LV_UGDMPRO_SAVE with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRO_SAVE" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[9965]: [fs] stop: Could not match /dev/VG_UGDMPRO_CTRL/LV_UGDMPRO_CTRL with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRO_CTRL" returned 2 (invalid argument(s))
Jul 4 14:27:50 oraugov4 rgmanager[10002]: [fs] stop: Could not match /dev/VG_UGDMPRO_TEMP/LV_UGDMPRO_TEMP with a real device
Jul 4 14:27:50 oraugov4 rgmanager[6469]: stop on fs "UGDMPRO_TEMP" returned 2 (invalid argument(s))
Jul 4 14:27:53 oraugov4 rgmanager[6469]: State change: icloraugov3 UP
Jul 4 14:28:11 oraugov4 rgmanager[6469]: #12: RG service:MYSERVICE failed to stop; intervention required

So I think I have a double problem:
1) the lv fails to activate because it is already active
2) to recover, it then tries to stop the resources, but fs.sh fails because
there seems to be no related lv under the fs

I think that during the stop it should reverse the order: it should stop the
fs first (and get a result of "already stopped") and only afterwards
deactivate the related lv... or not?

Gianluca

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
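
The stop ordering proposed in the message above (stop the fs first, treat
"already stopped" as success, and only then deactivate the lv) can be
sketched roughly as follows. This is only an illustration in Python: the
Resource class and the start_all/stop_all helpers are hypothetical names made
up for the example, and it does not describe how rgmanager or its resource
agents are actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for one resource of a service."""
    kind: str             # e.g. "lvm" or "fs" (labels chosen for this sketch)
    name: str
    active: bool = False


def start_all(resources):
    """Start resources in the listed order (lv before the fs on top of it)."""
    for res in resources:
        res.active = True


def stop_all(resources):
    """Stop resources in reverse of the start order.

    A resource that is already stopped is skipped and counted as success,
    so the teardown can continue down to the lv.
    """
    visited = []
    for res in reversed(resources):
        if res.active:
            res.active = False
        visited.append(f"{res.kind}:{res.name}")
    return visited


# Assumed layout for the example: one lv with one fs on top of it.
service = [
    Resource("lvm", "LV_UGDMPRO_TEMP"),
    Resource("fs", "UGDMPRO_TEMP"),
]

start_all(service)
stop_order = stop_all(service)
print(stop_order)
```

Calling stop_all a second time on the same list changes nothing and still
walks the resources in the same order, which is the "already stopped counts
as success" behaviour described in the message.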