Hello,

This post is long, so please bear with me.

I am building an active-active Samba cluster across two nodes.

Nodes (both running RHEL 4.5):
--------------------------
kaka1: 192.168.3.52
kaka2: 192.168.3.249

Here is "/etc/cluster/cluster.conf":
---------------------------
<cluster alias="seedorf" config_version="159" name="seedorf">
  <fence_daemon post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="kaka1" votes="1">
      <fence>
        <method name="1">
          <device name="NPS" nodename="kaka1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="kaka2" votes="1">
      <fence>
        <method name="1">
          <device name="NPS" nodename="kaka2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="NPS"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="failover-1" ordered="1">
        <failoverdomainnode name="kaka1" priority="1"/>
        <failoverdomainnode name="kaka2" priority="2"/>
      </failoverdomain>
      <failoverdomain name="failover-2" ordered="1">
        <failoverdomainnode name="kaka1" priority="2"/>
        <failoverdomainnode name="kaka2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <clusterfs device="/dev/milan/mirror" force_unmount="0" fsid="37802" fstype="gfs" mountpoint="/nfsdata" name="phillip_gfs" options="acl"/>
      <smb name="samba_1" workgroup="samba_test"/>
      <smb name="samba_2" workgroup="samba_test"/>
      <script file="/etc/init.d/smb" name="smb_script"/>
      <ip address="192.168.3.143" monitor_link="1"/>
      <ip address="192.168.3.150" monitor_link="1"/>
    </resources>
    <service autostart="1" domain="failover-1" name="smb-1" recovery="relocate">
      <smb ref="samba_1">
        <clusterfs ref="phillip_gfs"/>
        <script ref="smb_script"/>
      </smb>
      <ip ref="192.168.3.143"/>
    </service>
    <service autostart="1" domain="failover-2" name="smb-2" recovery="relocate">
      <smb ref="samba_2">
        <clusterfs ref="phillip_gfs"/>
        <script ref="smb_script"/>
      </smb>
      <ip ref="192.168.3.150"/>
    </service>
  </rm>
</cluster>
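
To make the service layout in this configuration easier to follow, here is a minimal Python sketch. It is an illustration only and not part of the cluster setup. It reads a trimmed copy of the <rm> section and lists, for each service, the preferred node order from its failover domain and the resources it references, then reports which resources are referenced by more than one service. The names CLUSTER_CONF and summarize are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the <rm> section of the cluster.conf shown above.
CLUSTER_CONF = """\
<cluster alias="seedorf" config_version="159" name="seedorf">
  <rm>
    <failoverdomains>
      <failoverdomain name="failover-1" ordered="1">
        <failoverdomainnode name="kaka1" priority="1"/>
        <failoverdomainnode name="kaka2" priority="2"/>
      </failoverdomain>
      <failoverdomain name="failover-2" ordered="1">
        <failoverdomainnode name="kaka1" priority="2"/>
        <failoverdomainnode name="kaka2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <service autostart="1" domain="failover-1" name="smb-1" recovery="relocate">
      <smb ref="samba_1">
        <clusterfs ref="phillip_gfs"/>
        <script ref="smb_script"/>
      </smb>
      <ip ref="192.168.3.143"/>
    </service>
    <service autostart="1" domain="failover-2" name="smb-2" recovery="relocate">
      <smb ref="samba_2">
        <clusterfs ref="phillip_gfs"/>
        <script ref="smb_script"/>
      </smb>
      <ip ref="192.168.3.150"/>
    </service>
  </rm>
</cluster>
"""


def summarize(conf_text):
    """Return (services, shared) for a cluster.conf given as a string.

    services: {service name: {"preferred": [nodes], "refs": [(tag, ref)]}}
    shared:   {(tag, ref): [service names]} for refs used by 2+ services
    """
    root = ET.fromstring(conf_text)

    # Node order per failover domain, lowest priority number first.
    domains = {}
    for dom in root.iter("failoverdomain"):
        nodes = sorted(dom.iter("failoverdomainnode"),
                       key=lambda n: int(n.get("priority")))
        domains[dom.get("name")] = [n.get("name") for n in nodes]

    services = {}
    users = defaultdict(list)
    for svc in root.iter("service"):
        # Every nested element carrying a ref= attribute, in document order.
        refs = [(el.tag, el.get("ref")) for el in svc.iter() if el.get("ref")]
        services[svc.get("name")] = {
            "preferred": domains[svc.get("domain")],
            "refs": refs,
        }
        for ref in refs:
            users[ref].append(svc.get("name"))

    shared = {ref: names for ref, names in users.items() if len(names) > 1}
    return services, shared


if __name__ == "__main__":
    services, shared = summarize(CLUSTER_CONF)
    for name, info in services.items():
        print(name, "prefers", info["preferred"], "and references", info["refs"])
    for (tag, ref), names in shared.items():
        print(tag, ref, "is referenced by", names)
```

For the configuration above, this reports that smb-1 prefers kaka1, smb-2 prefers kaka2, and that both services reference the same clusterfs resource (phillip_gfs) and the same script resource (smb_script).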

When both nodes are running, /etc/samba/smb.conf.samba_1 is created automatically on kaka1, and /etc/samba/smb.conf.samba_2 on kaka2:

On kaka1:
--------------------------
[root@kaka1 samba]# cat smb.conf.samba_1 | grep -v "#"
[global]
workgroup = samba_test
pid directory = /var/run/samba/samba_1
lock directory = /var/cache/samba/samba_1
log file = /var/log/samba/%m.log
encrypt passwords = yes
bind interfaces only = yes
netbios name = samba_1
interfaces = 192.168.3.143

[test]
public = yes
path = /nfsdata
read only = no

[root@kaka1 samba]# scp smb.conf.samba_1 kaka2:/etc/samba/

On kaka2:
---------------------------
[root@kaka2 samba]# cat smb.conf.samba_2 | grep -v "#"
[global]
workgroup = samba_test
pid directory = /var/run/samba/samba_2
lock directory = /var/cache/samba/samba_2
log file = /var/log/samba/%m.log
encrypt passwords = yes
bind interfaces only = yes
netbios name = samba_2
interfaces = 192.168.3.150

[test2]
public = yes
path = /nfsdata
read only = no

[root@kaka2 samba]# scp smb.conf.samba_2 kaka1:/etc/samba/

Now I reboot the nodes and check the cluster status:
---------------------------------
[root@kaka2 ~]# clustat
Member Status: Quorate

  Member Name                 Status
  ------ ----                 ------
  kaka1                       Online, rgmanager
  kaka2                       Online, Local, rgmanager

  Service Name      Owner (Last)      State
  ------- ----      ----- ------      -----
  smb-1             kaka1             started
  smb-2             kaka2             started

I can see that the floating IPs have been assigned:
----------------------------------
On kaka1:

[root@kaka1 ~]# ip addr list
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:e8:11:a1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.52/24 brd 192.168.3.255 scope global eth0
    inet 192.168.3.143/32 scope global eth0
    inet6 fe80::20c:29ff:fee8:11a1/64 scope link
       valid_lft forever preferred_lft forever
3: sit0: <NOARP> mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0

On kaka2:

[root@kaka2 ~]# ip addr list
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:24:0c:72 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.249/24 brd 192.168.3.255 scope global eth0
    inet 192.168.3.150/32 scope global eth0
    inet6 fe80::20c:29ff:fe24:c72/64 scope link
       valid_lft forever preferred_lft forever
3: sit0: <NOARP> mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0

At this point I power off kaka1, and kaka1's floating IP (192.168.3.143) is added to kaka2:
-------------------------------------------------
[root@kaka2 ~]# ip addr list
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:24:0c:72 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.249/24 brd 192.168.3.255 scope global eth0
    inet 192.168.3.150/32 scope global eth0
    inet 192.168.3.143/32 scope global eth0
    inet6 fe80::20c:29ff:fe24:c72/64 scope link
       valid_lft forever preferred_lft forever
3: sit0: <NOARP> mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0

So far so good: the Samba services seem to keep running, and clients accessing 192.168.3.143 notice no interruption.
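
The address check done by eye above can also be scripted. The following is a minimal Python sketch, assuming the `ip addr list` output is passed in as text. The sample is kaka2's output after kaka1 was powered off, and the names FLOATING_IPS, ipv4_addresses and floating_ips_held are made up for this example.

```python
import re

# The two floating addresses defined as <ip> resources in cluster.conf.
FLOATING_IPS = ["192.168.3.143", "192.168.3.150"]

# kaka2's "ip addr list" output after kaka1 was powered off (IPv6 trimmed).
SAMPLE = """\
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:24:0c:72 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.249/24 brd 192.168.3.255 scope global eth0
    inet 192.168.3.150/32 scope global eth0
    inet 192.168.3.143/32 scope global eth0
3: sit0: <NOARP> mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0
"""


def ipv4_addresses(ip_addr_output):
    """Return {interface: [IPv4 address, ...]} parsed from the 'inet' lines."""
    result = {}
    current = None
    for line in ip_addr_output.splitlines():
        # Interface header, e.g. "2: eth0: <BROADCAST,...>"
        header = re.match(r"(\d+):\s+([^:\s]+):", line)
        if header:
            current = header.group(2)
            result.setdefault(current, [])
            continue
        # IPv4 line, e.g. "    inet 192.168.3.150/32 scope global eth0".
        # "inet6" lines do not match because \s+ must follow "inet".
        inet = re.match(r"\s*inet\s+(\d+\.\d+\.\d+\.\d+)/\d+", line)
        if inet and current is not None:
            result[current].append(inet.group(1))
    return result


def floating_ips_held(ip_addr_output):
    """Return the floating IPs (in FLOATING_IPS order) present on this node."""
    held = {addr for addrs in ipv4_addresses(ip_addr_output).values()
            for addr in addrs}
    return [ip for ip in FLOATING_IPS if ip in held]


if __name__ == "__main__":
    print(floating_ips_held(SAMPLE))
```

Run against the sample, it reports both floating IPs on kaka2, which matches the state after failover.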

Cluster status after kaka1 is powered off:
---------------------------------------
[root@kaka2 ~]# clustat
Member Status: Quorate

  Member Name                 Status
  ------ ----                 ------
  kaka1                       Offline
  kaka2                       Online, Local, rgmanager

  Service Name      Owner (Last)      State
  ------- ----      ----- ------      -----
  smb-1             kaka2             started
  smb-2             kaka2             started

However, when I power kaka1 back on, the trouble starts: not only is 192.168.3.143 removed from kaka2, but kaka2 also loses its own floating IP, 192.168.3.150. These errors appear in "/var/log/messages" on kaka2:
---------------------------------------------
[root@kaka2 ~]# tail -f /var/log/messages
Jul 17 17:49:24 kaka2 kernel: CMAN: node kaka1 rejoining
Jul 17 17:49:33 kaka2 clurgmgrd[3393]: <info> Magma Event: Membership Change
Jul 17 17:49:33 kaka2 clurgmgrd[3393]: <info> State change: kaka1 UP
Jul 17 17:49:35 kaka2 clurgmgrd[3393]: <notice> Stopping service smb-1
Jul 17 17:49:36 kaka2 clurgmgrd: [3393]: <info> Removing IPv4 address 192.168.3.143 from eth0
Jul 17 17:49:44 kaka2 clurgmgrd: [3393]: <info> Executing /etc/init.d/smb status
Jul 17 17:49:46 kaka2 clurgmgrd: [3393]: <info> Executing /etc/init.d/smb stop
Jul 17 17:49:46 kaka2 smb: smbd shutdown succeeded
Jul 17 17:49:46 kaka2 nmbd[4571]: [2007/07/17 17:49:46, 0] nmbd/nmbd.c:terminate(56)
Jul 17 17:49:46 kaka2 nmbd[4571]: Got SIGTERM: going down...
Jul 17 17:49:46 kaka2 nmbd[4571]: [2007/07/17 17:49:46, 0] libsmb/nmblib.c:send_udp(790)
Jul 17 17:49:46 kaka2 nmbd[4571]: Packet send failed to 192.168.3.255(138) ERRNO=Invalid argument
Jul 17 17:49:46 kaka2 smb: nmbd shutdown succeeded
Jul 17 17:49:47 kaka2 clurgmgrd: [3393]: <info> Stopping Samba instance "samba_1"
Jul 17 17:49:47 kaka2 nmbd[6736]: [2007/07/17 17:49:47, 0] nmbd/nmbd.c:terminate(56)
Jul 17 17:49:47 kaka2 nmbd[6736]: Got SIGTERM: going down...
Jul 17 17:49:47 kaka2 nmbd[6736]: [2007/07/17 17:49:47, 0] libsmb/nmblib.c:send_udp(790)
Jul 17 17:49:47 kaka2 nmbd[6736]: Packet send failed to 192.168.3.255(138) ERRNO=Invalid argument
Jul 17 17:49:47 kaka2 clurgmgrd[3393]: <notice> Service smb-1 is stopped
Jul 17 17:50:14 kaka2 clurgmgrd: [3393]: <err> share_start_stop: nmbd for service died!
Jul 17 17:50:14 kaka2 clurgmgrd[3393]: <notice> status on smb:samba_2 returned 255 (unspecified)
Jul 17 17:50:14 kaka2 clurgmgrd[3393]: <notice> Stopping service smb-2
Jul 17 17:50:14 kaka2 clurgmgrd: [3393]: <info> Removing IPv4 address 192.168.3.150 from eth0
Jul 17 17:50:15 kaka2 nmbd[4488]: [2007/07/17 17:50:15, 0] lib/interface.c:load_interfaces(220)
Jul 17 17:50:15 kaka2 nmbd[4488]: WARNING: no network interfaces found
Jul 17 17:50:15 kaka2 nmbd[4488]: [2007/07/17 17:50:15, 0] nmbd/nmbd.c:reload_interfaces(265)
Jul 17 17:50:15 kaka2 nmbd[4488]: reload_interfaces: No subnets to listen to. Shutting down...
Jul 17 17:50:24 kaka2 clurgmgrd: [3393]: <info> Executing /etc/init.d/smb stop
Jul 17 17:50:24 kaka2 smb: smbd shutdown failed
Jul 17 17:50:24 kaka2 smb: nmbd shutdown failed
Jul 17 17:50:24 kaka2 clurgmgrd: [3393]: <err> script:smb_script: stop of /etc/init.d/smb failed (returned 1)
Jul 17 17:50:24 kaka2 clurgmgrd[3393]: <notice> stop on script:smb_script returned 1 (generic error)
Jul 17 17:50:24 kaka2 clurgmgrd[3393]: <crit> #12: RG smb-2 failed to stop; intervention required
Jul 17 17:50:24 kaka2 clurgmgrd[3393]: <notice> Service smb-2 is failed

[root@kaka2 ~]# ip addr list
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:0c:29:24:0c:72 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.249/24 brd 192.168.3.255 scope global eth0
    inet6 fe80::20c:29ff:fe24:c72/64 scope link
       valid_lft forever preferred_lft forever
3: sit0: <NOARP> mtu 1480 qdisc noop
    link/sit 0.0.0.0 brd 0.0.0.0

[root@kaka2 ~]# clustat
Member Status: Quorate

  Member Name                 Status
  ------ ----                 ------
  kaka1                       Online, rgmanager
  kaka2                       Online, Local, rgmanager

  Service Name      Owner (Last)      State
  ------- ----      ----- ------      -----
  smb-1             kaka1             started
  smb-2             (kaka2)           failed

As I understand an active-active Samba cluster, every Samba service should keep running and should be able to fail over to the other node when its own node fails. In my case, though, when kaka1 is powered on again, the Samba service "smb-2" on kaka2 fails and its floating IP is removed as well.

Would you please help me fix this issue? Any suggestion would be appreciated.

Regards,
Phillip