Not able to start OSD

Hi,

I am not able to start some of the OSDs in the cluster.

This is a test cluster with 8 OSDs. One node was taken out for maintenance. I set the noout flag, and after the server came back up I unset it.

Suddenly a couple of OSDs went down.

Now I can start the OSDs manually on each node, but their status is still "down".

$  ceph osd stat
8 osds: 2 up, 5 in


$ ceph osd tree
ID  CLASS WEIGHT  TYPE NAME                 STATUS REWEIGHT PRI-AFF
 -1       7.97388 root default
 -3       1.86469     host a1-osd
  1   ssd 1.86469         osd.1               down        0 1.00000
 -5       0.87320     host a2-osd
  2   ssd 0.87320         osd.2               down        0 1.00000
 -7       0.87320     host a3-osd
  4   ssd 0.87320         osd.4               down  1.00000 1.00000
 -9       0.87320     host a4-osd
  8   ssd 0.87320         osd.8                 up  1.00000 1.00000
-11       0.87320     host a5-osd
 12   ssd 0.87320         osd.12              down  1.00000 1.00000
-13       0.87320     host a6-osd
 17   ssd 0.87320         osd.17                up  1.00000 1.00000
-15       0.87320     host a7-osd
 21   ssd 0.87320         osd.21              down  1.00000 1.00000
-17       0.87000     host a8-osd
 28   ssd 0.87000         osd.28              down        0 1.00000
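
To make the tree output easier to compare with the "2 up, 5 in" summary, here is a small sketch of a filter that lists only the down OSDs together with their in/out state. The function name down_osds is hypothetical, and it assumes the column layout shown above (ID, CLASS, WEIGHT, NAME, STATUS, REWEIGHT, PRI-AFF) and that a REWEIGHT of 0 means the OSD is marked out.

```shell
# Hypothetical helper: read `ceph osd tree` output on stdin and print
# each down OSD followed by "in" or "out".
# Assumption: fields are ID CLASS WEIGHT NAME STATUS REWEIGHT PRI-AFF,
# and REWEIGHT 0 means the OSD is marked out.
down_osds() {
    awk '$4 ~ /^osd\.[0-9]+$/ && $5 == "down" {
        state = ($6 + 0 == 0) ? "out" : "in"
        print $4, state
    }'
}

# Usage (on a node with an admin keyring):
#   ceph osd tree | down_osds
```

Against the tree above this should report osd.1, osd.2 and osd.28 as out and osd.4, osd.12 and osd.21 as in, which matches the 5 "in" OSDs from ceph osd stat.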

I also see this error on each OSD node:

# systemctl status ceph-osd@1
● ceph-osd@1.service - Ceph object storage daemon osd.1
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2017-10-19 11:35:18 PDT; 19min ago
  Process: 4163 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=killed, signal=ABRT)
  Process: 4158 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 4163 (code=killed, signal=ABRT)

Oct 19 11:34:58 ceph-las1-a1-osd systemd[1]: Unit ceph-osd@1.service entered failed state.
Oct 19 11:34:58 ceph-las1-a1-osd systemd[1]: ceph-osd@1.service failed.
Oct 19 11:35:18 ceph-las1-a1-osd systemd[1]: ceph-osd@1.service holdoff time over, scheduling restart.
Oct 19 11:35:18 ceph-las1-a1-osd systemd[1]: start request repeated too quickly for ceph-osd@1.service
Oct 19 11:35:18 ceph-las1-a1-osd systemd[1]: Failed to start Ceph object storage daemon osd.1.
Oct 19 11:35:18 ceph-las1-a1-osd systemd[1]: Unit ceph-osd@1.service entered failed state.
Oct 19 11:35:18 ceph-las1-a1-osd systemd[1]: ceph-osd@1.service failed.
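
The status output only shows that the daemon was killed with SIGABRT and that systemd hit its start limit, not why the OSD aborted. A rough sketch of how the underlying reason could be collected on a node follows. The function name osd_abort_info is hypothetical, and the log path assumes the default cluster name "ceph" and the default log directory.

```shell
# Hypothetical helper: gather the last log lines for one OSD id.
# Assumptions: cluster name "ceph", logs under /var/log/ceph,
# and root privileges on the OSD node.
osd_abort_info() {
    id="$1"
    # Clear the start-limit state so the unit can be started again later.
    systemctl reset-failed "ceph-osd@${id}"
    # Recent journal entries for the unit.
    journalctl -u "ceph-osd@${id}" --no-pager -n 200
    # Tail of the OSD's own log, where an abort backtrace would normally appear.
    tail -n 200 "/var/log/ceph/ceph-osd.${id}.log"
}

# Usage (as root on the OSD node):
#   osd_abort_info 1
```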


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



