Re: 14.2.1 OSDs crash and sometimes fail to start back up, workaround

Slight correction: I removed and re-added only the OSDs that were crashing.
It seemed to be only certain OSDs that were affected, and once they were rebuilt, they stopped crashing.
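
(Roughly, something like the following can show which OSDs are the crashing ones; the OSD id is a placeholder, and "ceph crash ls" assumes the Nautilus crash module is collecting reports:)

    ceph osd tree                     # look for OSDs flapping between up and down
    ceph crash ls                     # recent daemon crash reports (new in Nautilus)
    journalctl -u ceph-osd@<id> -e    # last log lines for a suspect OSD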

Further info: we originally deployed Luminous, upgraded to Mimic, and then upgraded to Nautilus.
Perhaps the OSD issues were related to the upgrades? I don’t know.
Perhaps a clean install of 14.2.1 would not have done this? I don’t know.

-Ed

> On Jul 12, 2019, at 11:32 AM, Edward Kalk <ekalk@xxxxxxxxxx> wrote:
> 
> It seems that I have been able to work around my issues.
> I’ve attempted to reproduce the problem by rebooting nodes and by stopping all OSDs, waiting a bit, and starting them again.
> At this time, no OSDs are crashing like before, and they seem to have no problems starting either.
> What I did was completely remove the OSDs one at a time and recreate them, allowing Ceph 14.2.1 to rebuild them.
> <remove and reuse a disk.txt> I have attached the doc I use to accomplish this. *Before I do it, I mark the OSD as “out” via the GUI or CLI and allow it to reweight to 0%, monitoring this via ceph -s. I do this so that the removal doesn’t act like an actual disk failure, which would put me into a dual-disk failure while I’m rebuilding an OSD.
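> 
> (Roughly, the CLI side of that looks something like the following; the OSD id and device are placeholders, and the attached doc is the full procedure I actually follow:)
> 
>     ceph osd out <id>                            # drain the OSD; data remaps elsewhere
>     ceph -s                                      # watch until the rebalance has finished
>     systemctl stop ceph-osd@<id>                 # on the OSD host
>     ceph osd purge <id> --yes-i-really-mean-it   # drop it from the CRUSH, auth, and OSD maps
>     ceph-volume lvm zap /dev/<device> --destroy  # wipe the old OSD's data
>     ceph-volume lvm create --data /dev/<device>  # let 14.2.1 recreate the OSD on the same disk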
> 
> -Edward Kalk
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



