Ceph cluster works UNTIL the OSDs are rebooted

I had a working Ceph cluster running Nautilus in a test lab just a few months ago. Now that I'm trying to take Ceph live on production hardware, I can't get the cluster to stay up and available, even though all three OSDs are UP and IN.

I believe the problem is that the OSDs don't mount their volumes after a reboot. The ceph-deploy routine can install an OSD node, format the disk, and bring it online, and it can get all the OSD nodes UP and IN and reach a quorum. But once an OSD is rebooted, all the PGs related to that OSD go "stuck inactive...current state unknown, last acting".

I've found and resolved all my hostname and firewall errors, and I'm comfortable that I've ruled out network issues. For grins and giggles, I reconfigured the OSDs to be on the same 'public' network with the MON servers and the OSDs still drop their disks from the cluster after a reboot.

What do I need to do next?

Below is a pastebin link to some log file data where you can see some traceback errors.

----
[2019-10-30 14:52:10,201][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
----
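In case it helps anyone reading the full log, here is a minimal Python sketch for pulling only the ERROR entries (and the traceback lines that follow them) out of a ceph-volume log. It assumes every entry starts with a header shaped like the one in the excerpt above, `[timestamp][logger][LEVEL ]`. The function name `error_entries` is made up for illustration, and the log path in the usage note is an assumption about where the file lives on the OSD node.

```python
import re

# Header format taken from the excerpt above, e.g.:
# [2019-10-30 14:52:10,201][ceph_volume][ERROR ] exception caught by decorator
HEADER = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<logger>[^\]]+)\]\[(?P<level>[A-Z]+)\s*\]\s?(?P<msg>.*)$"
)


def error_entries(lines):
    """Yield (timestamp, message, continuation_lines) for each ERROR entry.

    Lines that do not look like a header (traceback lines, for example) are
    attached to the most recent ERROR entry. Hypothetical helper, not part
    of ceph-volume itself.
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A new header closes whatever entry was being collected.
            if current is not None:
                yield current
            current = None
            if match.group("level") == "ERROR":
                current = (match.group("ts"), match.group("msg"), [])
        elif current is not None:
            current[2].append(line)
    if current is not None:
        yield current
```

Usage would be something like `for ts, msg, tb in error_entries(open("/var/log/ceph/ceph-volume.log")): print(ts, msg)`, with the path adjusted to wherever the log actually is.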

Some of these errors might be due to the system seeing the three earlier setup attempts that are no longer available. Running 'ceph-deploy purge' and 'ceph-deploy purgedata' doesn't seem to get rid of EVERYTHING. I've since learned that /var/lib/ceph retains some data, so I'll be sure to remove the data from that directory the next time I start fresh.

What do I need to be looking at to correct this "OSD not remounting its disk" issue?

https://pastebin.com/NMXvYBcZ


