On Wed, Feb 17, 2016 at 4:44 PM, Kostis Fardelas <dante1234@xxxxxxxxx> wrote:
> Thanks Greg,
> I gather from reading about ceph_objectstore_tool that it acts at the
> level of the PG. The fact is that I do not want to wipe the whole PG,
> only export certain objects (the unfound ones) and import them again
> into the cluster. To be precise, the PG with the unfound objects is
> mapped like this:
>
> osdmap e257960 pg 3.5a9 (3.5a9) -> up [86,30] acting [86]
>
> but by searching in the underlying filesystem of the crashed OSD, I can
> verify that it contains the 4MB unfound objects which I get with pg
> list_missing and which cannot be found on any other probed OSD.
>
> Do you know if and how I could achieve this with ceph_objectstore_tool?

You can't just pull out single objects. What you can do is export the
entire PG containing the objects, import it into a random OSD, and then
let the cluster recover from that OSD. (Assuming all the data you need
is there -- just because you can see the files on disk doesn't mean all
the separate metadata is available.)
-Greg

> Regards,
> Kostis
>
> On 18 February 2016 at 01:22, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> On Wed, Feb 17, 2016 at 3:05 PM, Kostis Fardelas <dante1234@xxxxxxxxx> wrote:
>>> Hello cephers,
>>> due to an unfortunate sequence of events (disk crashes, network
>>> problems), we are currently in a situation with one PG that reports
>>> unfound objects. There is also an OSD which cannot start up and
>>> crashes with the following:
>>>
>>> 2016-02-17 18:40:01.919546 7fecb0692700 -1 os/FileStore.cc: In
>>> function 'virtual int FileStore::read(coll_t, const ghobject_t&,
>>> uint64_t, size_t, ceph::bufferlist&, bool)'
>>> thread 7fecb0692700 time 2016-02-17 18:40:01.889980
>>> os/FileStore.cc: 2650: FAILED assert(allow_eio ||
>>> !m_filestore_fail_eio || got != -5)
>>>
>>> (There is probably a problem with the OSD's underlying disk storage.)
>>>
>>> By querying the PG that is stuck in the
>>> active+recovering+degraded+remapped state due to the unfound objects,
>>> I can see that all possible OSDs have been probed except for the
>>> crashed one:
>>>
>>> "might_have_unfound": [
>>>       { "osd": "30",
>>>         "status": "already probed"},
>>>       { "osd": "102",
>>>         "status": "already probed"},
>>>       { "osd": "104",
>>>         "status": "osd is down"},
>>>       { "osd": "105",
>>>         "status": "already probed"},
>>>       { "osd": "145",
>>>         "status": "already probed"}],
>>>
>>> so I understand that the crashed OSD may have the latest version of
>>> the objects. I can also verify that I can find the 4MB objects in the
>>> underlying filesystem of the crashed OSD.
>>>
>>> By issuing ceph pg 3.5a9 list_missing, I get information like this
>>> for every unfound object:
>>>
>>> { "oid": { "oid":
>>> "829d5be29cd6e231e7e951ba58ad3d0baf7fba65aad40083cef39bb03d5ec0fd",
>>>       "key": "",
>>>       "snapid": -2,
>>>       "hash": 3880052137,
>>>       "max": 0,
>>>       "pool": 3,
>>>       "namespace": ""},
>>>   "need": "255658'37078125",
>>>   "have": "255651'37077081",
>>>   "locations": []}
>>>
>>> My question is: what is the best solution to follow?
>>> a. Is there any way to export the objects from the crashed OSD's
>>> filesystem and reimport them into the cluster? How could that be done?
>>
>> Look at ceph_objectstore_tool. eg,
>> http://ceph-users.ceph.narkive.com/lwDkR2fZ/recovering-incomplete-pgs-with-ceph-objectstore-tool
>>
>>> b. If I issue ceph pg {pg-id} mark_unfound_lost revert, should I
>>> expect that the "have" version of this object (thus an older version
>>> of the object) will become the current one?
>>
>> It should, although I gather this sometimes takes some contortions for
>> reasons I've never worked out.
>> -Greg
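
For reference, the export/import that Greg describes is done with the
ceph-objectstore-tool binary while the OSDs involved are stopped. What
follows is only a minimal sketch, assuming the crashed OSD is osd.104
(the "osd is down" entry in might_have_unfound), that its FileStore is
still mountable at the default path, and that osd.999 is a placeholder
for whichever healthy OSD is chosen as the import target; option names
can differ between releases, so check ceph-objectstore-tool --help on
your version first:

    # keep the cluster from rebalancing while OSDs are taken down
    ceph osd set noout

    # on the crashed OSD's host, with the osd.104 daemon stopped:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-104 \
        --journal-path /var/lib/ceph/osd/ceph-104/journal \
        --pgid 3.5a9 --op export --file /tmp/pg3.5a9.export

    # copy /tmp/pg3.5a9.export to the target host, stop osd.999, then:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-999 \
        --journal-path /var/lib/ceph/osd/ceph-999/journal \
        --op import --file /tmp/pg3.5a9.export

    # restart osd.999 and let recovery pick the objects up
    ceph osd unset noout
    ceph -w

The import is typically refused if the target OSD already holds a copy
of that PG, so pick an OSD outside the PG's up/acting set, and keep
Greg's caveat in mind: the object files being visible on disk does not
guarantee that all of the accompanying metadata survived.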
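
On question (b), if recovery from an exported copy turns out not to be
possible, the revert is issued per PG. A minimal sketch for this
thread's PG, assuming you accept rolling the unfound objects back to
their "have" versions:

    # roll unfound objects in pg 3.5a9 back to the last version the cluster has
    ceph pg 3.5a9 mark_unfound_lost revert

    # verify that the unfound count drops to zero afterwards
    ceph pg 3.5a9 query
    ceph health detail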