Re: dealing with unfound pg in 4:2 ec pool

Hi,

I'm not sure if setting min_size to 4 would also fix the PGs, but client IO would probably be restored. Marking objects as lost is the last resort according to this list; luckily I haven't been in such a situation yet. So give it a try with min_size = 4, but don't forget to increase it again after the PGs have recovered. Keep in mind that while min_size is lowered, losing another OSD could mean data loss. Are your OSDs still crashing unexpectedly?
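Roughly, the commands would look like this (just a sketch -- I'm using "your-ec-pool" as a placeholder for whatever pool PG 28.5b belongs to, and assuming the usual k+1 = 5 default min_size for a 4+2 profile):

  # see which objects the PG cannot find
  ceph pg 28.5b list_unfound

  # temporarily allow client IO with only 4 shards available
  ceph osd pool set your-ec-pool min_size 4

  # ...after recovery/backfill has finished, restore the default
  ceph osd pool set your-ec-pool min_size 5

  # absolute last resort, only if the missing shards really are gone for good
  # (this is the "mark as lost" step the list warns about):
  # ceph pg 28.5b mark_unfound_lost delete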



Quoting "Szabo, Istvan (Agoda)" <Istvan.Szabo@xxxxxxxxx>:

Hi,

If I set the min_size of the pool to 4, will this PG be recovered? Or how can I get the cluster out of a health error like this? Marking the objects as lost seems risky based on some mailing list experience (even after marking them lost you can still have issues), so I'm curious what the right way is to get the cluster out of this state and let it recover:

Example problematic pg:
dumped pgs_brief
PG_STAT STATE                                                  UP                 UP_PRIMARY  ACTING                              ACTING_PRIMARY
28.5b   active+recovery_unfound+undersized+degraded+remapped  [18,33,10,0,48,1]  18          [2147483647,2147483647,29,21,4,47]  29

Cluster state:
  cluster:
    id:     5a07ec50-4eee-4336-aa11-46ca76edcc24
    health: HEALTH_ERR
            10 OSD(s) experiencing BlueFS spillover
            4/1055070542 objects unfound (0.000%)
            noout flag(s) set
            Possible data damage: 2 pgs recovery_unfound
            Degraded data redundancy: 64150765/6329079237 objects degraded (1.014%), 10 pgs degraded, 26 pgs undersized
            4 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum mon-2s01,mon-2s02,mon-2s03 (age 2M)
    mgr: mon-2s01(active, since 2M), standbys: mon-2s03, mon-2s02
    osd: 49 osds: 49 up (since 36m), 49 in (since 4d); 28 remapped pgs
         flags noout
    rgw: 3 daemons active (mon-2s01.rgw0, mon-2s02.rgw0, mon-2s03.rgw0)

  task status:

  data:
    pools:   9 pools, 425 pgs
    objects: 1.06G objects, 66 TiB
    usage:   158 TiB used, 465 TiB / 623 TiB avail
    pgs:     64150765/6329079237 objects degraded (1.014%)
             38922319/6329079237 objects misplaced (0.615%)
             4/1055070542 objects unfound (0.000%)
             393 active+clean
             13  active+undersized+remapped+backfill_wait
             8   active+undersized+degraded+remapped+backfill_wait
             3   active+clean+scrubbing
             3   active+undersized+remapped+backfilling
             2   active+recovery_unfound+undersized+degraded+remapped
             2   active+remapped+backfill_wait
             1   active+clean+scrubbing+deep

  io:
    client:   181 MiB/s rd, 9.4 MiB/s wr, 5.38k op/s rd, 2.42k op/s wr
    recovery: 23 MiB/s, 389 objects/s


Thank you.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


