Re: Recover unfound objects from crashed OSD's underlying filesystem

Kostis Fardelas <dante1234@xxxxxxxxx> · Thu, 18 Feb 2016 02:44:59 +0200

Thanks Greg,
I gather from reading about ceph_objectstore_tool that it acts at the
level of the PG. The fact is that I do not want to wipe the whole PG,
only export certain objects (the unfound ones) and import them again
into the cluster. To be precise, the PG with the unfound objects is
mapped like this:
osdmap e257960 pg 3.5a9 (3.5a9) -> up [86,30] acting [86]

but by searching the underlying filesystem of the crashed OSD, I can
verify that it contains the 4MB unfound objects that pg list_missing
reports and that cannot be found on any of the other probed OSDs.
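
For cross-checking the files on disk against what the PG reports, here
is a rough sketch of how the list_missing entries could be reduced to a
list of object names and versions. It assumes the entries have already
been pulled out as a JSON list shaped like the one quoted further down;
parse_eversion and summarize_missing are hypothetical helper names, not
part of any Ceph tool.

```python
import json


def parse_eversion(text):
    """Split a version string such as "255658'37078125" into (epoch, version)."""
    epoch, version = text.split("'")
    return int(epoch), int(version)


def summarize_missing(entries):
    """Return (object name, need, have) for each entry with no known location."""
    unfound = []
    for entry in entries:
        if entry["locations"]:
            # Some OSD is known to hold this object, so it is not unfound.
            continue
        unfound.append((
            entry["oid"]["oid"],
            parse_eversion(entry["need"]),
            parse_eversion(entry["have"]),
        ))
    return unfound


# Sample entry copied from the list_missing output quoted in this thread.
sample = json.loads("""
[
  { "oid": { "oid": "829d5be29cd6e231e7e951ba58ad3d0baf7fba65aad40083cef39bb03d5ec0fd",
             "key": "",
             "snapid": -2,
             "hash": 3880052137,
             "max": 0,
             "pool": 3,
             "namespace": "" },
    "need": "255658'37078125",
    "have": "255651'37077081",
    "locations": [] }
]
""")

for name, need, have in summarize_missing(sample):
    print(name, "need", need, "have", have)
```

The resulting names are what I would search for in the crashed OSD's
filesystem; the need/have pairs show how far behind the surviving copy
is.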

Do you know whether, and how, I could achieve this with ceph_objectstore_tool?

Regards,
Kostis

On 18 February 2016 at 01:22, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Wed, Feb 17, 2016 at 3:05 PM, Kostis Fardelas <dante1234@xxxxxxxxx> wrote:
>> Hello cephers,
>> due to an unfortunate sequence of events (disk crashes, network
>> problems), we are currently in a situation with one PG that reports
>> unfound objects. There is also an OSD which cannot start up and
>> crashes with the following:
>>
>> 2016-02-17 18:40:01.919546 7fecb0692700 -1 os/FileStore.cc: In
>> function 'virtual int FileStore::read(coll_t, const ghobject_t&,
>> uint64_t, size_t, ceph::bufferlist&, bool)
>> ' thread 7fecb0692700 time 2016-02-17 18:40:01.889980
>> os/FileStore.cc: 2650: FAILED assert(allow_eio ||
>> !m_filestore_fail_eio || got != -5)
>>
>> (There is probably a problem with the OSD's underlying disk storage)
>>
>> By querying the PG that is stuck in
>> active+recovering+degraded+remapped state due to the unfound objects,
>> I understand that all possible OSDs are probed except for the crashed
>> one:
>>
>> "might_have_unfound": [
>>   { "osd": "30",
>>    "status": "already probed"},
>>   { "osd": "102",
>>    "status": "already probed"},
>>   { "osd": "104",
>>    "status": "osd is down"},
>>   { "osd": "105",
>>    "status": "already probed"},
>>   { "osd": "145",
>>    "status": "already probed"}],
>>
>> so I understand that the crashed OSD may have the latest version of
>> the objects. I can also verify that I can find the 4MB objects in
>> the underlying filesystem of the crashed OSD.
>>
>> By issuing ceph pg 3.5a9 list_missing, I get for all unfound objects,
>> information like this:
>>
>>         { "oid": { "oid":
>> "829d5be29cd6e231e7e951ba58ad3d0baf7fba65aad40083cef39bb03d5ec0fd",
>>               "key": "",
>>               "snapid": -2,
>>               "hash": 3880052137,
>>               "max": 0,
>>               "pool": 3,
>>               "namespace": ""},
>>           "need": "255658'37078125",
>>           "have": "255651'37077081",
>>           "locations": []}
>>
>>
>> My question is: what is the best course of action?
>> a. Is there any way to export the objects from the crashed OSD's
>> filesystem and reimport them into the cluster? How could that be done?
>
> Look at ceph_objectstore_tool, e.g.,
> http://ceph-users.ceph.narkive.com/lwDkR2fZ/recovering-incomplete-pgs-with-ceph-objectstore-tool
>
>> b. If I issue ceph pg {pg-id} mark_unfound_lost revert, should I
>> expect that the "have" version of this object (thus an older version
>> of the object) will become enabled?
>
> It should, although I gather this sometimes takes some contortions for
> reasons I've never worked out.
> -Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com