osd crash: trim_object could not find coid

Hi Greg,

Thanks for your support!

On 08. 09. 14 20:20, Gregory Farnum wrote:

> The first one is not caused by the same thing as the ticket you
> reference (it was fixed well before emperor), so it appears to be some
> kind of disk corruption.
> The second one is definitely corruption of some kind as it's missing
> an OSDMap it thinks it should have. It's possible that you're running
> into bugs in emperor that were fixed after we stopped doing regular
> support releases of it, but I'm more concerned that you've got disk
> corruption in the stores. What kind of crashes did you see previously;
> are there any relevant messages in dmesg, etc?

Nothing special in dmesg except probably irrelevant XFS warnings:

XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

All logs from before the disaster are still there. Do you have any
advice on what would be relevant?

> Given these issues, you might be best off identifying exactly which
> PGs are missing, carefully copying them to working OSDs (use the osd
> store tool), and killing these OSDs. Do lots of backups at each
> stage...

This sounds scary. I'll keep my fingers crossed and make plenty of
backups. There are 17 PGs with missing objects.
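
For reference, here is a rough sketch of how the affected PGs could be
listed from saved 'ceph health detail' output. The line format it
matches ("pg <pgid> is <state>, <N> unfound") is an assumption and
should be checked against the release in use; the sample text and the
helper name pgs_with_unfound are purely illustrative, not real output
from this cluster.

```python
import re

# Assumed line shape, e.g. "pg 2.4 is active+degraded, 78 unfound".
# Verify against the actual output of your Ceph release before relying on it.
PG_LINE = re.compile(r"^pg (\d+\.[0-9a-f]+) is (\S+), (\d+) unfound")


def pgs_with_unfound(health_detail):
    """Map each PG id to its unfound-object count (hypothetical helper)."""
    result = {}
    for line in health_detail.splitlines():
        match = PG_LINE.match(line.strip())
        if match:
            result[match.group(1)] = int(match.group(3))
    return result


# Illustrative sample only, not taken from the cluster discussed here.
SAMPLE = """\
HEALTH_WARN 2 pgs degraded; 81 unfound
pg 2.4 is active+degraded, 78 unfound
pg 3.1f is active+degraded, 3 unfound
pg 0.7 is stuck unclean for 1234.5, current state active+remapped, last acting [1,0]
"""

if __name__ == "__main__":
    for pgid, count in sorted(pgs_with_unfound(SAMPLE).items()):
        print(pgid, count)
```

Lines that do not report unfound objects are skipped, so the result
only contains the PGs that would need to be copied.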

What exactly do you mean by the osd store tool? Is it the
'ceph_filestore_tool' binary?

François


