Hi Stefan,
unfortunately it doesn't start.
The failed OSD (osd.0) is located on gedaopl02.
[root@gedasvl02 ~]# ceph osd tree
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
-1       0.43658 root default
-7       0.21829     host gedaopl01
 2   ssd 0.21829         osd.2           up  1.00000 1.00000
-3             0     host gedaopl02
-5       0.21829     host gedaopl03
 3   ssd 0.21829         osd.3           up  1.00000 1.00000
 0             0 osd.0                 down        0 1.00000
[root@gedaopl02 ~]# systemctl --failed
  UNIT                                                                    LOAD   ACTIVE SUB    DESCRIPTION
● ceph-d0920c36-2368-11eb-a5de-005056b703af@mgr.gedaopl02.pijxbm.service loaded failed failed Ceph mgr.gedaopl02.pijxbm for d0920c36-2368-11eb-a5de-005056b703af
● ceph-d0920c36-2368-11eb-a5de-005056b703af@osd.0.service                loaded failed failed Ceph osd.0 for d0920c36-2368-11eb-a5de-005056b703af
● ceph-d0920c36-2368-11eb-a5de-005056b703af@osd.1.service                loaded failed failed Ceph osd.1 for d0920c36-2368-11eb-a5de-005056b703af
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
3 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
I can start the service, but after a minute or so it fails again. Maybe
I'm looking at the wrong log file, as it's empty:
[root@gedaopl02 ~]# tail -f /var/log/ceph/d0920c36-2368-11eb-a5de-005056b703af/ceph-osd.0.log
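If I understand cephadm correctly, the containerized daemons log to journald by default unless file logging is enabled, so I will also check the journal for osd.0 (unit name taken from the systemctl output above, fsid from the cluster):

journalctl -u ceph-d0920c36-2368-11eb-a5de-005056b703af@osd.0.service -e

or, letting cephadm pick the right unit:

cephadm logs --fsid d0920c36-2368-11eb-a5de-005056b703af --name osd.0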
Yesterday, when I deleted the failed OSD and recreated it, there were lots
of messages in the log file:
https://pastebin.com/5hH27pdR
Cheers,
Oliver
On 01.12.2020 at 09:22, Stefan Kooman wrote:
On 2020-11-30 15:55, Oliver Weinmann wrote:
I have another error, "pgs undersized"; maybe this is also causing trouble?
This is a result of the loss of one OSD and the PGs located on it. As
you only have 2 OSDs left, the cluster cannot recover onto a third OSD
(assuming defaults here). The cluster will heal itself as soon as the
third OSD is back online.
Can you start the OSD? If not, can you provide logs of the failing OSD?
Gr. Stefan
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx