On Mon, Jul 07, 2008 at 10:48:28AM -0500, David Teigland wrote:
> On Sun, Jul 06, 2008 at 05:51:05PM -0400, J. Bruce Fields wrote:
> > -	write(control_fd, in, sizeof(struct gdlm_plock_info));
> > +	write(control_fd, in, sizeof(struct dlm_plock_info));
>
> Gah, sorry, I keep fixing that and it keeps reappearing.
>
> > Jul  1 14:06:42 piglet2 kernel: dlm: connect from non cluster node
>
> > It looks like dlm_new_workspace() is waiting on dlm_recoverd, which is
> > in "D" state in dlm_rcom_status(), so I guess the second node isn't
> > getting some dlm reply it expects?
>
> dlm inter-node communication is not working here for some reason.  There
> must be something unusual with the way the network is configured on the
> nodes, and/or a problem with the way the cluster code is applying the
> network config to the dlm.
>
> Ah, I just remembered what this sounds like; we see this kind of thing
> when a network interface has multiple IP addresses, and/or routing is
> configured strangely.  Others cc'ed could offer better details on exactly
> what to look for.

OK, thanks!  I'm trying to run gfs2 on 4 kvm machines, I'm an expert on
neither, and it's entirely likely there's some obvious misconfiguration.

On the kvm host there are 4 virtual interfaces bridged together:

bfields@pig:~$ brctl show
bridge name     bridge id               STP enabled     interfaces
vnet0           8000.00ff0823c0f3       yes             vnet1
                                                        vnet2
                                                        vnet3
                                                        vnet4

vnet0 has address 192.168.122.1 on the host, and the 4 kvm guests are
statically assigned addresses 129, 130, 131, and 132 on the 192.168.122.*
network, so a kvm guest looks like:

piglet1:~# ifconfig
eth1      Link encap:Ethernet  HWaddr 00:16:3e:16:4d:61
          inet addr:192.168.122.129  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::216:3eff:fe16:4d61/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2464 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1806 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:197099 (192.4 KiB)  TX bytes:165606 (161.7 KiB)
          Interrupt:11 Base address:0xc100

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:285 errors:0 dropped:0 overruns:0 frame:0
          TX packets:285 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:13394 (13.0 KiB)  TX bytes:13394 (13.0 KiB)

piglet1:~# cat /etc/hosts
127.0.0.1       localhost
192.168.122.129 piglet1
192.168.122.130 piglet2
192.168.122.131 piglet3
192.168.122.132 piglet4

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

The network setup looks otherwise fine--they can all ping each other and
the outside world.

--b.
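
For anyone wanting to test the multiple-address theory quoted above mechanically, here is a minimal sketch in Python. It is not part of any cluster tool. The function names (parse_hosts, suspicious_nodes) are hypothetical, and it only inspects hosts-file text like the one pasted above. It flags a node name that has no entry, that maps to more than one address, or that maps to a loopback address. Whether any of these conditions actually produces the "connect from non cluster node" message is an assumption to verify, not something established in this thread.

```python
import ipaddress
from collections import defaultdict

def parse_hosts(text):
    """Map each hostname to the set of addresses listed for it."""
    table = defaultdict(set)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        addr, *names = line.split()
        for name in names:
            table[name].add(addr)
    return table

def suspicious_nodes(text, nodes):
    """Return {node: reason} for node names whose entries look ambiguous."""
    table = parse_hosts(text)
    problems = {}
    for node in nodes:
        addrs = table.get(node, set())
        if not addrs:
            problems[node] = "no entry"
        elif len(addrs) > 1:
            problems[node] = "multiple addresses: " + ", ".join(sorted(addrs))
        elif ipaddress.ip_address(next(iter(addrs))).is_loopback:
            problems[node] = "resolves to loopback"
    return problems

# Hosts data as shown in this message (IPv6 helper lines omitted).
HOSTS = """\
127.0.0.1       localhost
192.168.122.129 piglet1
192.168.122.130 piglet2
192.168.122.131 piglet3
192.168.122.132 piglet4
"""
NODES = ["piglet1", "piglet2", "piglet3", "piglet4"]

if __name__ == "__main__":
    print(suspicious_nodes(HOSTS, NODES))
```

On the hosts file above this reports nothing, so if the theory holds, the ambiguity would have to come from somewhere this check cannot see (interface aliases, routing, or how the cluster configuration names the nodes).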