FAILED assert(p.same_interval_since) and unusable cluster

Hello,

I have three OSDs that are crashing on start with a FAILED assert(p.same_interval_since) error. I came across a thread from a few days ago about the same issue, and a ticket was created for it here: http://tracker.ceph.com/issues/21833.

A very overloaded node in my cluster OOM'd many times, which eventually led to the problematic PGs and then to the failed assert.

I currently have 49 PGs inactive, 33 PGs down, and 15 PGs incomplete, and 0.028% of objects are unfound. Presumably because of this, I can't write any data to the FS, and some existing data can't be read. Just about any IO results in a good number of stuck requests.

Hopefully a fix will come out of that ticket, but in the meantime, can anyone give me some suggestions or guidance on getting the cluster back into a working state?

Thanks
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
