Re: Weird PG Acting Set

Hi Weiwen,

Yes, it is an EC 4+2 pool. Should I do "osd out" on the affected OSDs before
doing the procedure you mentioned? Do you mean I should take the affected OSDs
down one by one, run the procedure on each, and then bring it up again? If I
take all of them down at once, I'm afraid this will impact other PGs that
share the same OSDs. Would you mind giving me a safe step-by-step procedure? I
don't mind losing this PG, since that is the risk, but I need client I/O not
to freeze while I recover the RBD images that have objects in this PG; this
pool is an RBD data pool.
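
To make the EC 4+2 constraint concrete, here is a minimal Python sketch of the shard arithmetic. It assumes the usual erasure-coding rule that any k of the k+m shards are enough to rebuild an object, and that each wiped OSD held exactly one shard of this PG; the function name is only illustrative and is not part of any Ceph tool.

```python
def pg_recoverable(k: int, m: int, lost_shards: int) -> bool:
    """Return True if an EC k+m PG still has at least k shards left.

    Assumption: any k of the k+m shards suffice to rebuild the data,
    and every lost OSD held exactly one shard of the PG.
    """
    surviving = (k + m) - lost_shards
    return surviving >= k


# EC 4+2: two lost shards can be tolerated, three cannot.
print(pg_recoverable(4, 2, 2))  # True
print(pg_recoverable(4, 2, 3))  # False
```

With OSDs 3, 71 and 237 recreated, three of the six shards are gone, which is one more than this profile tolerates.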

Best regards,

On Fri, May 7, 2021 at 10:17 PM 胡玮文 <huww98@xxxxxxxxxxx> wrote:

> On 2021/5/7 at 6:46 PM, Lazuardi Nasution wrote:
>
> Hi,
>
> After recreating some related OSDs (3, 71 and 237), the acting set is now
> normal, but the PG is incomplete and there are slow ops on the primary OSD
> (3). I have tried to make it normal
>
> Hi Lazuardi,
>
> I assume this PG is in an EC 4+2 pool, so it can lose at most 2 OSDs? Now
> that you have wiped the data on 3 OSDs, this PG does not have enough
> information to recover its content.
>
> If that is the case, I guess that unless you can recover the data on some of
> these recreated OSDs, you cannot recover the content of this PG. Your best
> choice may be to delete all objects in it (with "ceph-objectstore-tool
> --op remove", then "ceph osd force-create-pg", I believe). Be aware of
> the data loss.
>
> Weiwen Hu
>
> with the osd_find_best_info_ignore_history_les approach, but the PG is still
> incomplete. In this condition, client I/O sometimes freezes; I suspect that
> the blocks inside this PG cause the freeze. How can I resolve this
> incomplete PG, or at least keep client I/O from freezing so that I can
> recover the rest of the normal blocks, like recovering a drive with bad
> sectors?
>
> Best regards,
>
> On Wed, May 5, 2021 at 12:29 AM Lazuardi Nasution <mrxlazuardin@xxxxxxxxx>
> wrote:
>
>
> Hi,
>
> Suddenly we have a recovery_unfound situation. I find that the PG acting set
> is missing some OSDs that are up. Why can't OSDs 3 and 71 in the following
> PG query result be members of the acting set? We currently use v15.2.8. How
> can we recover from this situation?
>
> {
>     "snap_trimq": "[]",
>     "snap_trimq_len": 0,
>     "state": "active+forced_recovery+recovery_unfound+undersized+degraded+remapped",
>     "epoch": 237505,
>     "up": [
>         3,
>         237,
>         71,
>         132,
>         115,
>         56
>     ],
>     "acting": [
>         2147483647,
>         237,
>         2147483647,
>         132,
>         115,
>         56
>     ],
>     "backfill_targets": [
>         "3(0)",
>         "71(2)"
>     ],
>     "acting_recovery_backfill": [
>         "3(0)",
>         "56(5)",
>         "71(2)",
>         "115(4)",
>         "132(3)",
>         "237(1)"
>     ],
>
> Best regards.
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>
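
For anyone reading the PG query output quoted above: the value 2147483647 (0x7fffffff) in "acting" is CRUSH's "no item" placeholder, meaning no OSD currently serves that shard. The following is a minimal Python sketch that lists such slots next to the OSD that "up" expects there. The function name is hypothetical, and the sample input is just the "up" and "acting" arrays copied from the quoted output, not a full query result.

```python
import json

# 0x7fffffff: CRUSH placeholder meaning "no OSD assigned to this slot".
CRUSH_ITEM_NONE = 2147483647


def missing_acting_shards(pg_query: dict) -> list:
    """Return (shard_index, up_osd) pairs for acting slots with no OSD.

    Assumes "up" and "acting" have the same length, as they do for an
    erasure-coded PG where position equals shard index.
    """
    up = pg_query["up"]
    acting = pg_query["acting"]
    return [(i, up[i]) for i, osd in enumerate(acting) if osd == CRUSH_ITEM_NONE]


# Only the two arrays from the quoted output are reproduced here.
sample = json.loads("""
{
  "up": [3, 237, 71, 132, 115, 56],
  "acting": [2147483647, 237, 2147483647, 132, 115, 56]
}
""")

print(missing_acting_shards(sample))  # [(0, 3), (2, 71)]
```

The result lines up with the "backfill_targets" entries "3(0)" and "71(2)" in the same output: shards 0 and 2 have no acting OSD and are waiting to be backfilled onto OSDs 3 and 71.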