What does
# ceph tell osd.* version
reveal? Are any pre-v0.94.4 hammer OSDs running, as the error states?
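Since the down OSDs won't answer "ceph tell", a rough per-host alternative is something like the below (the hostnames are just placeholders for your OSD servers):
# for h in osd-host-1 osd-host-2; do ssh $h 'hostname; ceph-osd --version'; done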
On Tue, Mar 28, 2017 at 1:21 AM, Jaime Ibar <jaime@xxxxxxxxxxxx> wrote:
Hi,
I did change the ownership to user ceph. In fact, the OSD processes are running:
ps aux | grep ceph
ceph 2199 0.0 2.7 1729044 918792 ? Ssl Mar27 0:21 /usr/bin/ceph-osd --cluster=ceph -i 42 -f --setuser ceph --setgroup ceph
ceph 2200 0.0 2.7 1721212 911084 ? Ssl Mar27 0:20 /usr/bin/ceph-osd --cluster=ceph -i 18 -f --setuser ceph --setgroup ceph
ceph 2212 0.0 2.8 1732532 926580 ? Ssl Mar27 0:20 /usr/bin/ceph-osd --cluster=ceph -i 3 -f --setuser ceph --setgroup ceph
ceph 2215 0.0 2.8 1743552 935296 ? Ssl Mar27 0:20 /usr/bin/ceph-osd --cluster=ceph -i 47 -f --setuser ceph --setgroup ceph
ceph 2341 0.0 2.7 1715548 908312 ? Ssl Mar27 0:20 /usr/bin/ceph-osd --cluster=ceph -i 51 -f --setuser ceph --setgroup ceph
ceph 2383 0.0 2.7 1694944 893768 ? Ssl Mar27 0:20 /usr/bin/ceph-osd --cluster=ceph -i 56 -f --setuser ceph --setgroup ceph
[...]
If I run one of the OSDs with increased debug
ceph-osd --debug_osd 5 -i 31
this is what I get in the logs
[...]
0 osd.31 14016 done with init, starting boot process
2017-03-28 09:19:15.280182 7f083df0c800 1 osd.31 14016 We are healthy, booting
2017-03-28 09:19:15.280685 7f081cad3700 1 osd.31 14016 osdmap indicates one or more pre-v0.94.4 hammer OSDs is running
[...]
It seems the OSD is running, but ceph is not aware of it.
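One way to confirm that from the daemon's side, assuming the admin socket is in the default /var/run/ceph location, is to ask the running OSD directly on its host:
# ceph daemon osd.31 status
While the OSD is stuck like this, the "state" field there should show something like "booting" or "preboot" rather than "active".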
Thanks
Jaime
On 27/03/17 21:56, George Mihaiescu wrote:
Make sure the OSD processes on the Jewel node are running. If you didn't change the ownership to user ceph, they won't start.
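For reference, the ownership change from the Jewel release notes is roughly the following, run on the OSD host with the OSDs stopped (paths assume the default /var/lib/ceph layout):
# chown -R ceph:ceph /var/lib/ceph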
On Mar 27, 2017, at 11:53, Jaime Ibar <jaime@xxxxxxxxxxxx> wrote:
Hi all,
I'm upgrading a ceph cluster from Hammer 0.94.9 to Jewel 10.2.6.
The cluster has 3 servers (one mon and one mds each) and another 6 servers with
12 OSDs each.
The mons and mds have been successfully upgraded to the latest Jewel release; however,
after upgrading the first OSD server (12 OSDs), ceph is not aware of its OSDs and
they are marked as down
ceph -s
cluster 4a158d27-f750-41d5-9e7f-26ce4c9d2d45
health HEALTH_WARN
[...]
12/72 in osds are down
noout flag(s) set
osdmap e14010: 72 osds: 60 up, 72 in; 14641 remapped pgs
flags noout
[...]
ceph osd tree
3 3.64000 osd.3 down 1.00000 1.00000
8 3.64000 osd.8 down 1.00000 1.00000
14 3.64000 osd.14 down 1.00000 1.00000
18 3.64000 osd.18 down 1.00000 1.00000
21 3.64000 osd.21 down 1.00000 1.00000
28 3.64000 osd.28 down 1.00000 1.00000
31 3.64000 osd.31 down 1.00000 1.00000
37 3.64000 osd.37 down 1.00000 1.00000
42 3.64000 osd.42 down 1.00000 1.00000
47 3.64000 osd.47 down 1.00000 1.00000
51 3.64000 osd.51 down 1.00000 1.00000
56 3.64000 osd.56 down 1.00000 1.00000
If I run this command on one of the down OSDs
ceph osd in 14
osd.14 is already in.
However, ceph doesn't mark it as up, and the cluster health remains
in a degraded state.
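The "in" flag and the "up" state are tracked separately, so the OSD itself has to boot and report up. A minimal check that the daemon is at least alive on the node, assuming the systemd units that ship with Jewel, would be something like:
# systemctl status ceph-osd@14
# ceph daemon osd.14 status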
Do I have to upgrade all the OSDs to Jewel first?
Any help would be appreciated, as I'm running out of ideas.
Thanks
Jaime
--
Jaime Ibar
High Performance & Research Computing, IS Services
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/ | jaime@xxxxxxxxxxxx
Tel: +353-1-896-3725
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.andrus@xxxxxxxxxxxxx | www.dreamhost.com