I'm still exporting PGs out of some of the downed OSDs, but things are
definitely looking promising.

Marginally related to this thread, since these seem to be most of the
objects that hang while exporting PGs: what are inodes in the 600 range
used for within the metadata pool? I know the 200 range is used for
journaling. 8 of the 13 OSDs I've got left down are currently trying to
export objects in the 600 range. Are these just MDS journal objects from
an MDS severely behind on trimming?

-- Adam

On Thu, Jun 2, 2016 at 6:10 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
> On Thu, Jun 2, 2016 at 9:07 AM, Brandon Morris, PMP
> <brandon.morris.pmp@xxxxxxxxx> wrote:
>
>> The only way that I was able to get back to HEALTH_OK was to
>> export/import. ***** Please note: any time you use
>> ceph-objectstore-tool you risk data loss if it is not done carefully.
>> Never remove a PG until you have a known-good export. *****
>>
>> Here are the steps I used:
>>
>> 1. Set NOOUT and NOBACKFILL.
>> 2. Stop the OSDs that have the erroring PG.
>> 3. Flush the journal and export the primary version of the PG. This
>>    took 1 minute on a well-behaved PG and 4 hours on the misbehaving
>>    PG, i.e.:
>>    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 --journal-path /var/lib/ceph/osd/ceph-16/journal --pgid 32.10c --op export --file /root/32.10c.b.export
>>
>> 4. Import the PG into a new / temporary OSD that is also offline, i.e.:
>>    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-100 --journal-path /var/lib/ceph/osd/ceph-100/journal --pgid 32.10c --op export --file /root/32.10c.b.export
>
> This should be an import op, and presumably to a different data path
> and journal path, more like the following?
>
> ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-101
> --journal-path /var/lib/ceph/osd/ceph-101/journal --pgid 32.10c --op
> import --file /root/32.10c.b.export
>
> Just trying to clarify for anyone who comes across this thread in the
> future.
>
> Cheers,
> Brad
>
>> 5. Remove the PG from all other OSDs (16, 143, 214, and 448 in your
>>    case, it looks like).
>> 6. Start the cluster OSDs.
>> 7. Start the temporary OSD and ensure 32.10c backfills correctly to
>>    the 3 OSDs it is supposed to be on.
>>
>> This is similar to the recovery process described in this post from
>> 04/09/2015:
>> http://ceph-users.ceph.narkive.com/lwDkR2fZ/recovering-incomplete-pgs-with-ceph-objectstore-tool
>> Hopefully it works in your case too and you can get the cluster back
>> to a state where you can make the CephFS directories smaller.
>>
>> - Brandon

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
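
For anyone replaying the quoted steps end to end later, here is a minimal
sketch of the whole sequence with Brad's correction folded in (step 4 is an
import). The OSD ids (16 as the source of the good copy, 100 as the offline
temporary OSD, 143/214/448 holding the other copies), the pg id 32.10c and
the file paths are taken from the thread; the systemctl service names are an
assumption (use whatever init your release uses), and as Brandon stresses,
do not run the remove step until the export file is known good.

    # 1-2. Quiesce the cluster and stop the OSDs holding copies of 32.10c
    ceph osd set noout
    ceph osd set nobackfill
    systemctl stop ceph-osd@16        # repeat for osd.143, osd.214, osd.448

    # 3. Flush the journal and export the primary copy of the PG from osd.16
    ceph-osd -i 16 --flush-journal
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --pgid 32.10c --op export --file /root/32.10c.b.export

    # 4. Import it into the offline temporary osd.100 (--op import, per Brad)
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-100 \
        --journal-path /var/lib/ceph/osd/ceph-100/journal \
        --pgid 32.10c --op import --file /root/32.10c.b.export

    # 5. Only once the export file is known good: remove the old copies
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --pgid 32.10c --op remove    # repeat for osd.143, osd.214, osd.448

    # 6-7. Bring everything back and let 32.10c backfill to its proper OSDs
    systemctl start ceph-osd@16       # and the others, plus temporary osd.100
    ceph osd unset nobackfill
    ceph osd unset noout
    ceph pg 32.10c query              # watch until the PG is active+clean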
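
No answer here on what the 600-range inodes are for, but if it helps to see
exactly which of those objects exist and where they map before exporting,
here is a minimal sketch. It assumes the CephFS metadata pool is literally
named "metadata" (substitute your pool name) and uses 600.00000000 only as
an illustrative object name following the <hex-inode>.<hex-offset> pattern
visible in the thread.

    # List metadata-pool objects whose inode prefix falls in the 0x600 range
    rados -p metadata ls | grep '^6[0-9a-f][0-9a-f]\.'

    # For any one of them, show the PG it maps to and the OSDs that hold it
    ceph osd map metadata 600.00000000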