Re: OSD Restart results in "unfound objects"

I do…

 

In my case, the MONs are colocated with some of the OSDs. As recently as Saturday, when I lost data again, I found that one of the MON+OSD nodes had run out of memory and started killing ceph-mon on that node…

At the same moment, all OSDs started to complain about not being able to see other OSDs on other machines.

 

I suspect that when a node runs out of memory, bad things happen to, for instance, the network (no memory, so no network buffers?). But I can’t explain the unfound objects: in my case, as in yours, the nodes did not crash and ceph-osd did not crash either. Hence I’m assuming no data was lost to a sudden disk power-off, for instance, or to any kernel or RAID controller cache…

 

For now, I’m considering moving the MONs onto dedicated nodes, hoping that running out of memory was my issue.
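To confirm whether memory exhaustion really was involved on a given node, the kernel log can be filtered for OOM-killer events. This is only a minimal sketch, assuming the kills came from the Linux OOM killer; the helper name `oom_events` is hypothetical, and the patterns are the usual kernel message fragments rather than anything Ceph-specific.

```shell
# Hypothetical helper: read kernel log text on stdin and keep only the
# lines that look like OOM-killer activity.
oom_events() {
    grep -iE 'out of memory|oom-killer|killed process'
}

# Typical use on a MON+OSD node (either source of kernel messages works):
#   dmesg | oom_events
#   journalctl -k | oom_events
```

If ceph-mon shows up in the matching lines at the time the OSDs lost sight of each other, that would at least support the out-of-memory theory.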

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Diego Castro
Sent: Wednesday, 1 June 2016 10:25
To: ceph-users <ceph-users@xxxxxxxx>
Subject: [ceph-users] OSD Restart results in "unfound objects"

 

Hello, I have a cluster running Jewel 10.2.0, with 25 OSDs + 4 MONs.

Today my cluster suddenly went unhealthy, with lots of stuck PGs due to unfound objects. There were no disk failures or node crashes; it just went bad.

 

I managed to bring the cluster back to a healthy state by marking the unfound objects as lost and deleting them with "ceph pg <id> mark_unfound_lost delete".
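Since "mark_unfound_lost delete" discards the objects for good, it may be worth looking at what is unfound, and which OSDs were probed, before running it. A minimal sketch, assuming the standard ceph CLI; the helper name `inspect_unfound` is hypothetical, and on Jewel the listing subcommand is `list_missing` (later releases call it `list_unfound`).

```shell
# Hypothetical helper: show unfound-object details for one PG before
# deciding whether to mark the objects lost.
inspect_unfound() {
    local pgid="$1"
    if [ -z "$pgid" ]; then
        echo "usage: inspect_unfound <pgid>" >&2
        return 1
    fi
    # Which PGs currently report unfound objects
    ceph health detail | grep -i unfound || true
    # List the unfound objects in this PG (Jewel name; newer: list_unfound)
    ceph pg "$pgid" list_missing
    # The recovery_state section of the query output includes
    # "might_have_unfound", i.e. which OSDs were probed for the objects
    ceph pg "$pgid" query
}

# Example: inspect_unfound 2.4
```

If an OSD listed under "might_have_unfound" is simply down, bringing it back up may let recovery find the objects without deleting anything.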

I still have no idea why the cluster went bad, but I noticed that restarting the OSD daemons to unblock stuck clients makes the cluster unhealthy again, with PGs stuck once more due to unfound objects.

 

Does anyone have this issue?

 

---

Diego Castro / The CloudFather
GetupCloud.com - Eliminamos a Gravidade

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
