Re: Inactive PGs

On 3/13/20 5:44 PM, Peter Eisch wrote:
> Peter Eisch
> Senior Site Reliability Engineer
> 
> On 3/13/20, 11:38 AM, "Wido den Hollander" <wido@xxxxxxxx> wrote:
> 
> On 3/13/20 4:09 PM, Peter Eisch wrote:
>> Full cluster is 14.2.8.
>>
>> I had some OSDs drop overnight, which now leaves 4 inactive PGs. The
>> pools had three participating OSDs (2 ssd, 1 sas) each. In each pool at
>> least 1 ssd and 1 sas OSD is still working without issue. I've run
>> 'ceph pg repair <pg>', but it doesn't seem to make any changes.
>>
>> PG_AVAILABILITY Reduced data availability: 4 pgs inactive, 4 pgs incomplete
>> pg 10.2e is incomplete, acting [59,67]
>> pg 10.c3 is incomplete, acting [62,105]
>> pg 10.f3 is incomplete, acting [62,59]
>> pg 10.1d5 is incomplete, acting [87,106]
>>
>> Using `ceph pg <pg> query` I can see, for each PG, the OSDs involved,
>> including the ones which failed. Respectively they are:
>> pg 10.2e participants: 59, 68, 77, 143
>> pg 10.c3 participants: 60, 62, 85, 102, 105, 106
>> pg 10.f3 participants: 59, 64, 75, 107
>> pg 10.1d5 participants: 64, 77, 87, 106
>>
>> The OSDs which are now down/out, and which have been removed from the
>> crush map and from auth, are:
>> 62, 64, 68
>>
>> Of course I now have lots of slow-ops reports from OSDs waiting on the
>> inactive PGs.
>>
>> How do I properly kick these PGs to have them drop their usage of the
>> OSDs which no longer exist?
> 
> You don't, because those OSDs hold the data you need.
> 
> Why did you remove them from the CRUSH map, OSD map and auth? You need
> those to rebuild the PGs.
> 
> Wido
> 
> The drives failed at the hardware level. I've replaced OSDs this way in
> previous instances, for both planned migrations and failures, without
> issue. I didn't realize all the replicated copies were on just one
> drive in each pool.
> 
> What should my actions have been in this case?

Try to get those OSDs online again. Maybe attempt a rescue of the disks,
or see whether the OSDs can be made to start again.

A tool like dd_rescue can help with that.
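
For example, a rough sketch with GNU ddrescue (note the older dd_rescue
tool uses a different syntax), assuming /dev/sdX is the failing disk and
/dev/sdY is a healthy disk of at least the same size:

  # first pass: copy everything that still reads cleanly, keep a map file
  ddrescue -f -n /dev/sdX /dev/sdY rescue.map
  # second pass: retry the bad areas a few times
  ddrescue -f -r3 /dev/sdX /dev/sdY rescue.map

If that completes you can try to start the OSD from the cloned disk.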

> 
> pool 10 volumes' replicated size 2 min_size 1 crush_rule 1 object_hash
> rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 47570
> lfor 0/0/40781 flags hashpspool,selfmanaged_snaps stripe_width 0
> application rbd

I see you use 2x replication with min_size=1; that's dangerous and can
easily lead to data loss.

I wouldn't say it's impossible to get the data back, but something like
this can take a while (many hours) to bring back online.
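
Once you have the data back and the cluster is healthy again, I would
move to 3x replication. Roughly, using the pool name 'volumes' from your
dump (this triggers backfill and needs the space for a third copy):

  ceph osd pool set volumes size 3
  ceph osd pool set volumes min_size 2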

Wido

> 
> Crush rule 1:
> rule ssd_by_host {
>     id 1
>     type replicated
>     min_size 1
>     max_size 10
>     step take default class ssd
>     step chooseleaf firstn 0 type host
>     step emit
> }
> 
> peter
> 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



