Jeff,
First, if you've specified the public and cluster networks in [global], you don't need to specify them anywhere else. If you do, the values in the more specific sections override the [global] ones. That's not the issue here, though.
It appears from your ceph.conf file that you've given the monitor an address on the cluster network. Specifically, you specified mon addr = 10.100.10.1:6789, but elsewhere you indicate that this IP address belongs to the cluster network (10.100.10.0/24). That matches your log, which shows the monitor binding to 10.100.10.1:6789.
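As a rough sketch of what I mean, and not a tested configuration, the monitor section could point mon addr at the host's public address instead. I'm assuming here that 10.100.0.150 (the public addr value in your file) really is controller1's address on the public network:

```ini
[global]
# networks declared once; no need to repeat them in [mon] or [mon.controller1]
cluster network = 10.100.10.0/24
public network = 10.100.0.0/21

[mon.controller1]
host = controller1
# assumption: 10.100.0.150 is controller1's public-network IP,
# taken from the "public addr" line in your ceph.conf
mon addr = 10.100.0.150:6789
```

If the monitors were originally created with the cluster address, the monmap probably still records 10.100.10.1, so editing ceph.conf alone may not be enough to move them. I haven't confirmed that for your cluster.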
On Mon, Jan 13, 2014 at 11:29 AM, Jeff Bachtel <jbachtel@xxxxxxxxxxxxxxxxxxxxxx> wrote:
I've got a cluster with 3 mons, all of which are binding solely to a cluster network IP, and neither to 0.0.0.0:6789 nor a public IP. I hadn't noticed the problem until now because it makes little difference in how I normally use Ceph (rbd and radosgw), but now that I'm trying to use cephfs it's obviously suboptimal.
[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
keyring = /etc/ceph/keyring
cluster network = 10.100.10.0/24
public network = 10.100.0.0/21
public addr = 10.100.0.150
cluster addr = 10.100.10.1
fsid = de10594a-0737-4f34-a926-58dc9254f95f
[mon]
cluster network = 10.100.10.0/24
public network = 10.100.0.0/21
mon data = "">
[mon.controller1]
host = controller1
mon addr = 10.100.10.1:6789
public addr = 10.100.0.150
cluster addr = 10.100.10.1
cluster network = 10.100.10.0/24
public network = 10.100.0.0/21
And then with /usr/bin/ceph-mon -i controller1 --debug_ms 12 --pid-file /var/run/ceph/mon.controller1.pid -c /etc/ceph/ceph.conf I get in logs
2014-01-13 14:19:13.578458 7f195e6d97a0 0 ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60), process ceph-mon, pid 7559
2014-01-13 14:19:13.641639 7f195e6d97a0 10 -- :/0 rank.bind 10.100.10.1:6789/0
2014-01-13 14:19:13.641668 7f195e6d97a0 10 accepter.accepter.bind
2014-01-13 14:19:13.642773 7f195e6d97a0 10 accepter.accepter.bind bound to 10.100.10.1:6789/0
2014-01-13 14:19:13.642800 7f195e6d97a0 1 -- 10.100.10.1:6789/0 learned my addr 10.100.10.1:6789/0
2014-01-13 14:19:13.642808 7f195e6d97a0 1 accepter.accepter.bind my_inst.addr is 10.100.10.1:6789/0 need_addr=0
With no mention of public addr (10.100.2.1) or public network (10.100.0.0/21) found. mds (on this host) and osd (on other hosts) bind to 0.0.0.0 and a public IP, respectively.
At this point public/cluster addr/network are WAY overspecified in ceph.conf, but the problem appeared with far less specification.
Any ideas? Thanks,
Jeff
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
John Wilkins
Senior Technical Writer
Inktank
john.wilkins@xxxxxxxxxxx
(415) 425-9599
http://inktank.com