On 10/9/2018 12:19 PM, Gregory Farnum wrote:
On Wed, Oct 3, 2018 at 10:18 AM Graham Allan <gta@xxxxxxx> wrote:
However I have one pg which is stuck in state remapped+incomplete
because it has only 4 out of 6 osds running, and I have been unable to
bring the missing two back into service.
> PG_AVAILABILITY Reduced data availability: 1 pg inactive, 1 pg
incomplete
> pg 70.82d is remapped+incomplete, acting
[2147483647,2147483647,190,448,61,315] (reducing pool .rgw.buckets.ec42
min_size from 5 may help; search ceph.com/docs for 'incomplete')
I don't think I want to do anything with min_size, as that would leave all
the other pgs vulnerable to running dangerously undersized (unless there is
a way to force that state for only a single pg). It seems to me that with
4 of 6 osds available, it should be possible to force ceph to select one
or two new osds and rebalance this pg onto them?
Unfortunately, I think the easiest way for you to fix this will be to set
min_size back to 4 until the PG is recovered (or at least has 5 shards
done). This will be fixed in a later version of Ceph and probably
backported, but sadly it's not done yet.
-Greg
Thanks Greg, though sadly I've tried that; whatever I do, one of the 4
osds involved simply crashes (and not just the ones I previously tried to
re-import via ceph-objectstore-tool). I just end up spending time chasing
them around, never succeeding in keeping a complete set running long
enough to make progress. They seem to crash when starting backfill on the
next object, so there must be something in the current set of shards which
they can't handle.
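
For the record, the commands involved were roughly the following -- just a
sketch of what I ran, with osd ids and paths illustrative rather than exact:

    # temporarily drop min_size on the EC pool (normally 5, per the health
    # warning above), then watch whether the pg goes active before an osd dies
    ceph osd pool set .rgw.buckets.ec42 min_size 4
    ceph pg 70.82d query | grep '"state"'
    ceph -w

    # put it back once done
    ceph osd pool set .rgw.buckets.ec42 min_size 5

and the earlier shard re-imports were done with ceph-objectstore-tool
against stopped osds, along these lines (the export file was produced the
same way with --op export and --pgid 70.82dsN, where N is the EC shard id;
osd 448 here is just an example):

    systemctl stop ceph-osd@448
    # (filestore osds also need --journal-path)
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-448 \
        --op import --file /path/to/70.82ds3.export
    systemctl start ceph-osd@448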
Since then I've been focusing on trying to get the pg to revert to an
earlier interval using osd_find_best_info_ignore_history_les, though the
information I can find about it is minimal.
Most sources seem to suggest setting it for the primary osd and then either
marking it down or restarting it, but that simply seems to result in the
osd disappearing from the pg. After setting the flag for all of the
"acting" osds (most recent interval), the pg switched to having its set of
"active" osds == "up" osds, but it is still "incomplete" (it has not
reverted to the set of osds from an earlier interval). At present it's
still stuck with the condition "peering_blocked_by_history_les_bound".
I'm guessing that I actually need to set the flag
osd_find_best_info_ignore_history_les for *all* osds involved in the
historical record of this pg (the "probing osds" list?), and restart
them all...
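
If I do go that route, my rough plan -- based only on my reading so far, so
treat it as a sketch rather than a tested recipe -- would be something like:

    # for each osd in the probing list (osd.190 is just an example), set the
    # option in ceph.conf on that osd's host and restart the daemon so it is
    # in effect when the pg re-peers:
    #   [osd.190]
    #       osd find best info ignore history les = true
    systemctl restart ceph-osd@190

    # (some sources instead inject it and force re-peering by marking the
    #  osd down)
    ceph tell osd.190 injectargs '--osd_find_best_info_ignore_history_les=true'
    ceph osd down 190

    # then unset it again everywhere once the pg has peered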
I'm also still trying to understand exactly how the flag works. I think I
see now that the "_les" part must refer to "last epoch started"...
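
If that reading is right, the value in question should be the one visible
under the pg's history (the interpretation is my guess):

    ceph pg 70.82d query | grep last_epoch_started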
--
Graham Allan
Minnesota Supercomputing Institute - gta@xxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com