Last time I had to do this, I used the command outlined here:
https://tracker.ceph.com/issues/10098

On Mon, Mar 4, 2019 at 11:05 AM Daniel K <sathackr@xxxxxxxxx> wrote:
>
> Thanks for the suggestions.
>
> I've tried both -- setting osd_find_best_info_ignore_history_les = true
> and restarting all OSDs, as well as 'ceph osd force-create-pg' -- but
> both PGs still show incomplete.
>
> PG_AVAILABILITY Reduced data availability: 2 pgs inactive, 2 pgs incomplete
>     pg 18.c is incomplete, acting [32,48,58,40,13,44,61,59,30,27,43,37] (reducing pool ec84-hdd-zm min_size from 8 may help; search ceph.com/docs for 'incomplete')
>     pg 18.1e is incomplete, acting [50,49,41,58,60,46,52,37,34,63,57,16] (reducing pool ec84-hdd-zm min_size from 8 may help; search ceph.com/docs for 'incomplete')
>
> The OSDs in down_osds_we_would_probe have already been marked lost.
>
> When I ran the force-create-pg command, the PGs went to peering for a
> few seconds, but then went back to incomplete.
>
> Updated ceph pg 18.1e query: https://pastebin.com/XgZHvJXu
> Updated ceph pg 18.c query: https://pastebin.com/N7xdQnhX
>
> Any other suggestions?
>
> Thanks again,
>
> Daniel
>
> On Sat, Mar 2, 2019 at 3:44 PM Paul Emmerich <paul.emmerich@xxxxxxxx> wrote:
>>
>> On Sat, Mar 2, 2019 at 5:49 PM Alexandre Marangone
>> <a.marangone@xxxxxxxxx> wrote:
>> >
>> > If you have no way to recover the drives, you can try to reboot the
>> > OSDs with `osd_find_best_info_ignore_history_les = true` (revert it
>> > afterwards); you'll lose data. If the PGs are still down after this,
>> > you can mark the OSDs blocking them from becoming active as lost.
>>
>> This should work for PG 18.1e, but not for 18.c. Try running "ceph osd
>> force-create-pg <pgid>" to reset the PGs instead.
>> Data will obviously be lost afterwards.
>>
>> Paul
>>
>> >
>> > On Sat, Mar 2, 2019 at 6:08 AM Daniel K <sathackr@xxxxxxxxx> wrote:
>> >>
>> >> They all just started having read errors. Bus resets. Slow reads.
>> >> Which is one of the reasons the cluster didn't recover fast enough
>> >> to compensate.
>> >>
>> >> I tried to be mindful of the drive type and specifically avoided
>> >> the larger-capacity Seagates that are SMR. Used 1 SM863 for every
>> >> 6 drives for the WAL.
>> >>
>> >> Not sure why they failed. The data isn't critical at this point, I
>> >> just need to get the cluster back to normal.
>> >>
>> >> On Sat, Mar 2, 2019, 9:00 AM <jesper@xxxxxxxx> wrote:
>> >>>
>> >>> Did they break, or did something go wrong while trying to replace
>> >>> them?
>> >>>
>> >>> Jesper
>> >>>
>> >>> Sent from myMail for iOS
>> >>>
>> >>> Saturday, 2 March 2019, 14.34 +0100 from Daniel K <sathackr@xxxxxxxxx>:
>> >>>
>> >>> I bought the wrong drives trying to be cheap. They were 2TB WD
>> >>> Blue 5400rpm 2.5-inch laptop drives.
>> >>>
>> >>> They've been replaced now with HGST 10K 1.8TB SAS drives.
>> >>>
>> >>> On Sat, Mar 2, 2019, 12:04 AM <jesper@xxxxxxxx> wrote:
>> >>>
>> >>> Saturday, 2 March 2019, 04.20 +0100 from sathackr@xxxxxxxxx <sathackr@xxxxxxxxx>:
>> >>>
>> >>> 56 OSDs, 6-node 12.2.5 cluster on Proxmox.
>> >>>
>> >>> We had multiple drives fail (about 30%) within a few days of each
>> >>> other, likely faster than the cluster could recover.
>> >>>
>> >>> How did so many drives break?
>> >>>
>> >>> Jesper
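
For anyone who finds this thread later: a rough sketch of the recovery
path discussed above, assuming a Luminous (12.2.x) cluster. The PG IDs
(18.c, 18.1e) are the ones from this thread; the OSD id is a placeholder.
Every step below throws away whatever data was left in the affected PGs,
so treat it as a last resort, not a recipe:

    # 1) Let peering ignore last_epoch_started history. On Luminous, set
    #    this in ceph.conf on the OSD hosts and restart the OSDs (revert
    #    it and restart again once the PGs are active):
    #      [osd]
    #      osd_find_best_info_ignore_history_les = true
    systemctl restart ceph-osd.target          # on each OSD host

    # 2) Mark the dead OSDs listed in down_osds_we_would_probe as lost
    #    (osd.12 is a placeholder id):
    ceph osd lost 12 --yes-i-really-mean-it

    # 3) If a PG still will not go active, recreate it empty:
    ceph osd force-create-pg 18.c
    ceph osd force-create-pg 18.1e

    # 4) Verify, then remove the setting again:
    ceph pg 18.c query
    ceph health detail
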
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
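
The tracker issue linked at the top (https://tracker.ceph.com/issues/10098)
is, as far as I can tell, about ceph-objectstore-tool's mark-complete
operation, which forcibly marks a PG complete on the OSD holding the best
remaining copy so it can peer again. A hedged sketch of the usual
invocation -- the OSD id, data path, and FileStore journal path are
placeholders, and the OSD must be stopped while the tool runs:

    systemctl stop ceph-osd@48

    ceph-objectstore-tool \
        --data-path /var/lib/ceph/osd/ceph-48 \
        --journal-path /var/lib/ceph/osd/ceph-48/journal \
        --pgid 18.c \
        --op mark-complete

    systemctl start ceph-osd@48

Like the steps above, this only lets the PG go active again; it does not
bring back the objects that were lost with the failed drives.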