Moreover, if I restart the service on ceph-node1, which is the
initial monitor and also hosts an OSD and an MDS:
[root@ceph-node1 ~]# service ceph restart
=== mon.ceph-node1 ===
=== mon.ceph-node1 ===
Stopping Ceph mon.ceph-node1 on ceph-node1...kill 1215...done
=== mon.ceph-node1 ===
Starting Ceph mon.ceph-node1 on ceph-node1...
Starting ceph-create-keys on ceph-node1...
=== osd.2 ===
=== osd.2 ===
Stopping Ceph osd.2 on ceph-node1...done
=== osd.2 ===
2014-11-13 18:30:58.300930 7fef46bfd700 0 -- :/1002590 >>
192.168.122.23:6789/0 pipe(0x7fef40000c00 sd=4 :0 s=1 pgs=0 cs=0
l=1 c=0x7fef40000e90).fault
2014-11-13 18:31:10.302308 7fef4c1ce700 0 --
192.168.122.21:0/1002590 >> 192.168.122.23:6789/0
pipe(0x7fef40002000 sd=4 :0 s=1 pgs=0 cs=0 l=1
c=0x7fef40000c00).fault
2014-11-13 18:31:16.303037 7fef4c1ce700 0 --
192.168.122.21:0/1002590 >> 192.168.122.23:6789/0
pipe(0x7fef40005d30 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fef400020d0).fault
failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf
--name=osd.2 --keyring=/var/lib/ceph/osd/ceph-2/keyring osd crush
create-or-move -- 2 0.02 host=ceph-node1 root=default'
=== mds.ceph-node1 ===
=== mds.ceph-node1 ===
Stopping Ceph mds.ceph-node1 on ceph-node1...kill 1296...done
=== mds.ceph-node1 ===
Starting Ceph mds.ceph-node1 on ceph-node1...
starting mds.ceph-node1 at :/0
[root@ceph-node1 ~]# service ceph status
=== mon.ceph-node1 ===
mon.ceph-node1: running {"version":"0.80.7"}
=== osd.2 ===
osd.2: not running.
=== mds.ceph-node1 ===
mds.ceph-node1: running {"version":"0.80.7"}
The worst part, I think, is this one:
failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf
--name=osd.2 --keyring=/var/lib/ceph/osd/ceph-2/keyring osd crush
create-or-move -- 2 0.02 host=ceph-node1 root=default'
The OSD is not starting...
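Since the init script only gives that registration 30 seconds, one thing worth trying (just a sketch; the standard `-m` option tells the ceph CLI to contact a specific monitor instead of picking one from ceph.conf) is re-running the failed command by hand against a monitor that is still up, e.g. ceph-node2:

```shell
# Re-run the timed-out CRUSH registration manually, pointing the
# client at a surviving monitor with -m rather than letting it
# choose from the mon_host list in ceph.conf.
/usr/bin/ceph -c /etc/ceph/ceph.conf \
    --name=osd.2 --keyring=/var/lib/ceph/osd/ceph-2/keyring \
    -m 192.168.122.22:6789 \
    osd crush create-or-move -- 2 0.02 host=ceph-node1 root=default
```

If that returns promptly, the OSD start-up is only blocked on monitor discovery, not on anything wrong with the OSD itself.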
Cheers.
Luca
On 11/13/2014 06:33 PM, Luca Mazzaferro wrote:
Hi,
thank you for your answer:
On 11/13/2014 06:17 PM, Gregory Farnum wrote:
What does "ceph -s" output when things are working?
Does the ceph.conf on your admin node
BEFORE the problem (from ceph -w, since I don't have ceph -s):
[rzgceph@admin-node my-cluster]$ ceph -w
cluster 6fa39bb3-de2d-4ec5-9a86-9d96231d8b5b
health HEALTH_OK
monmap e3: 3 mons at
{ceph-node1=192.168.122.21:6789/0,ceph-node2=192.168.122.22:6789/0,ceph-node3=192.168.122.23:6789/0},
election epoch 6, quorum 0,1,2 ceph-node1,ceph-node2,ceph-node3
mdsmap e4: 1/1/1 up {0=ceph-node1=up:active}
osdmap e13: 3 osds: 3 up, 3 in
pgmap v26: 192 pgs, 3 pools, 1889 bytes data, 21 objects
103 MB used, 76655 MB / 76759 MB avail
192 active+clean
2014-11-13 17:08:43.240961 mon.0 [INF] pgmap v26: 192 pgs: 192
active+clean; 1889 bytes data, 103 MB used, 76655 MB / 76759 MB
avail; 8 B/s wr, 0 op/s
contain the address of each monitor? (Paste in the relevant
lines.) It will need to, or the ceph tool won't be able
to find the monitors even though the system is working.
No, only the initial one... the documentation doesn't mention
adding the others, but it is reasonable.
I added the other two. This is my ceph.conf:
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 192.168.122.21 192.68.122.22 192.168.122.23
mon_initial_members = ceph-node1
fsid = 6fa39bb3-de2d-4ec5-9a86-9d96231d8b5b
osd pool default size = 2
public network = 192.168.0.0/16
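One thing I'd double-check here (my own reading, not something raised in the replies): the second mon_host entry above is written as 192.68.122.22, which looks like a typo for 192.168.122.22, and the entries are usually written comma-separated:

```
[global]
mon_host = 192.168.122.21,192.168.122.22,192.168.122.23
```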
and then:
ceph-deploy --overwrite-conf admin admin-node ceph-node1
ceph-node2 ceph-node3
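After pushing with --overwrite-conf, it may also be worth confirming that the new mon_host line actually reached every node (a sketch; this assumes the same passwordless ssh that ceph-deploy uses):

```shell
# Verify the pushed ceph.conf landed on each node with all
# three monitor addresses present.
for n in ceph-node1 ceph-node2 ceph-node3; do
    echo "== $n =="
    ssh "$n" grep mon_host /etc/ceph/ceph.conf
done
```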
and now:
2014-11-13 18:24:57.522590 7fa4282d1700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa418001d40 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418001fb0).fault
2014-11-13 18:25:06.524145 7fa4283d2700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa418002fa0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418003210).fault
2014-11-13 18:25:12.525096 7fa4283d2700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa418003bf0 sd=4 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418003e60).fault
2014-11-13 18:25:21.526622 7fa4282d1700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa4180085a0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418008810).fault
2014-11-13 18:25:33.528831 7fa4284d3700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa4180085a0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa418008810).fault
2014-11-13 18:25:42.530185 7fa4284d3700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa418009740 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa4180099b0).fault
2014-11-13 18:25:51.531688 7fa4283d2700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa41800a330 sd=4 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa41800a5a0).fault
2014-11-13 18:26:09.534223 7fa4284d3700 0 --
192.168.122.11:0/1003667 >> 192.168.122.23:6789/0
pipe(0x7fa41800d550 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7fa41800e6b0).fault
Better: someone (ceph-node3) answers, but not in the right way,
I see.
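Those repeated pipe(...).fault lines mean the client's TCP session to 192.168.122.23:6789 keeps failing, so it retries and eventually falls back to another monitor. A quick way to separate a network problem from a dead monitor daemon (a sketch; it assumes nc is installed and uses the standard -m option):

```shell
# Is the monitor port on ceph-node3 even accepting connections?
nc -zv 192.168.122.23 6789

# Can we talk to a different monitor directly, bypassing ceph.conf?
ceph -m 192.168.122.22:6789 -s

# Which monitors are actually in quorum right now?
ceph -m 192.168.122.22:6789 quorum_status
```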
Luca
-Greg
On Thu, Nov 13, 2014 at 9:11 AM Luca
Mazzaferro < luca.mazzaferro@xxxxxxxxxx>
wrote:
On 11/13/2014 06:05 PM, Artem Silenkov wrote:
Hello!
Only 1 monitor instance? It won't work in most
cases.
Run more and ensure quorum for
survivability.
No, three monitor
instances, one on each ceph-node, as designed in the
ceph-deploy quick start.
I tried to kill one of them (the initial monitor) to see
what happens, and this is what happens.
:-(
Ciao
Luca
Regards, Silenkov Artem
---
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com