Hi Greg,

Thanks for your support!

On 08. 09. 14 20:20, Gregory Farnum wrote:
> The first one is not caused by the same thing as the ticket you
> reference (it was fixed well before emperor), so it appears to be some
> kind of disk corruption.
> The second one is definitely corruption of some kind as it's missing
> an OSDMap it thinks it should have. It's possible that you're running
> into bugs in emperor that were fixed after we stopped doing regular
> support releases of it, but I'm more concerned that you've got disk
> corruption in the stores. What kind of crashes did you see previously;
> are there any relevant messages in dmesg, etc?

Nothing special in dmesg except probably irrelevant XFS warnings:

XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

All logs from before the disaster are still there. Do you have any advice on which ones would be relevant?

> Given these issues, you might be best off identifying exactly which
> PGs are missing, carefully copying them to working OSDs (use the osd
> store tool), and killing these OSDs. Do lots of backups at each
> stage...

This sounds scary. I'll keep my fingers crossed and make a bunch of backups. There are 17 PGs with missing objects. What exactly do you mean by the osd store tool? Is it the 'ceph_filestore_tool' binary?

François