Re: PG does not become active

Update: the inactive PG recovered and became active after a very long wait. That answers the middle question from my original mail below. However, these two questions remain a serious concern:

- How can 2 OSDs be missing if only 1 OSD is down?
- If the PG should recover, why is it not prioritised considering its severe degradation
  compared with all other PGs?

I don't understand how a PG can lose 2 shards if only 1 OSD goes down. That looks really bad to me (did ceph lose track of data??).
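
In case someone wants to dig into this with me, here is a sketch of how I am trying to narrow it down (pg 4.32 as in the health output quoted below; adjust to your own IDs):

    # which OSDs are really down/out, and what does the PG itself report?
    ceph osd tree
    ceph pg 4.32 query      # check "up", "acting" and "past_intervals" for the lost shard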

The second question is no less important. The inactive PG was holding back client IO, leading to further warnings about slow OPS/requests/... Why are such critically degraded PGs not scheduled for recovery first? Effectively there is a service outage, yet ceph reports only a health warning.
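
As far as I know, one can at least bump such a PG to the front of the recovery queue by hand (force-recovery has been available since Luminous, if I remember correctly):

    # ask the OSDs to prioritise recovery of this PG
    ceph pg force-recovery 4.32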

Thanks and best regards.
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: 27 July 2022 17:19:05
To: ceph-users@xxxxxxx
Subject:  PG does not become active

I'm testing octopus 15.2.16 and ran into a problem right away. I'm filling up a small test cluster with 3 hosts of 3 OSDs each (3x3 OSDs) and killed one OSD to see how recovery works. I have one 4+2 EC pool with failure domain host, and on 1 PG of this pool 2 (!!!) shards are missing. This most degraded PG is not becoming active; it is stuck inactive but peered.

Questions:

- How can 2 OSDs be missing if only 1 OSD is down?
- Wasn't there an important code change to allow recovery of an EC PG with at
  least k shards present even if min_size>k? Do I have to set something for
  that (see the sketch after this list)?
- If the PG should recover, why is it not prioritised considering its severe degradation
  compared with all other PGs?
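
Regarding the min_size question, a sketch of what I would check first; the option name below is from memory, so please correct me if it is called differently in octopus:

    # is recovery below min_size allowed at all? (assumed option name)
    ceph config get osd osd_allow_recovery_below_min_size
    # pool-level sanity checks
    ceph osd pool get <pool> min_size
    ceph osd pool get <pool> size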

I have already increased these crush tunables and executed a pg repeer to no avail:

tunable choose_total_tries 250 <-- default 100
rule fs-data {
        id 1
        type erasure
        min_size 3
        max_size 6
        step set_chooseleaf_tries 50 <-- default 5
        step set_choose_tries 200 <-- default 100
        step take default
        step choose indep 0 type osd
        step emit
}
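
In case anyone wants to reproduce this, the standard decompile/edit/recompile cycle looks roughly like this (file names are just examples):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit the tunables and the fs-data rule in crushmap.txt, then:
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin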

Ceph health detail says the following about it:

[WRN] PG_AVAILABILITY: Reduced data availability: 1 pg inactive
    pg 4.32 is stuck inactive for 37m, current state recovery_wait+undersized+degraded+remapped+peered, last acting [1,2147483647,2147483647,4,5,2]
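
If I read the acting set correctly, 2147483647 is the placeholder for "no OSD mapped to this shard", so two of the six shards currently have no acting OSD at all. A quick way to keep an eye on it:

    ceph pg map 4.32            # compare the up set with the acting set
    ceph pg dump_stuck inactive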

I don't want to cheat and set min_size=k on this pool. It should work by itself.
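
(For reference, the cheat I am trying to avoid would be something like the line below, assuming the pool name matches the rule name fs-data; k=4 for this pool.)

    ceph osd pool set fs-data min_size 4    # deliberately NOT doing this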

Thanks for any pointers!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx