Re: Remove Error - "Possible data damage: 2 pgs recovery_unfound"

Jonathan Sélea <jonathan@xxxxxxxx> · Fri, 21 Aug 2020 12:03:48 +0200

Hi everyone,
I just wanted to ask again for your opinion on this "problem" that 
I have.
Thankful for any answer!

On 2020-08-19 13:39, Jonathan Sélea wrote:
Good afternoon!

I have a small Ceph cluster running with Proxmox. I updated one of the
nodes and rebooted it. So far so good.
But after a couple of hours, I saw this:

root@pve2:~# ceph health detail
HEALTH_ERR 16/1101836 objects unfound (0.001%); Possible data damage: 2 pgs recovery_unfound; Degraded data redundancy: 48/3305508 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
OBJECT_UNFOUND 16/1101836 objects unfound (0.001%)
    pg 1.37 has 6 unfound objects
    pg 1.48 has 10 unfound objects
PG_DAMAGED Possible data damage: 2 pgs recovery_unfound
    pg 1.37 is active+recovery_unfound+undersized+degraded+remapped, acting [11,17], 6 unfound
    pg 1.48 is active+recovery_unfound+undersized+degraded+remapped, acting [5,11], 10 unfound
PG_DEGRADED Degraded data redundancy: 48/3305508 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
    pg 1.37 is stuck undersized for 446774.454853, current state active+recovery_unfound+undersized+degraded+remapped, last acting [11,17]
    pg 1.48 is stuck undersized for 446774.459466, current state active+recovery_unfound+undersized+degraded+remapped, last acting [5,11]

root@pve2:~# ceph -s
  cluster:
    id:     76e70c34-bce9-4f86-b049-0054f21c3494
    health: HEALTH_ERR
            16/1101836 objects unfound (0.001%)
            Possible data damage: 2 pgs recovery_unfound
            Degraded data redundancy: 48/3305508 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized

  services:
    mon: 3 daemons, quorum pve3,pve1,pve2 (age 2w)
    mgr: pve3(active, since 2w), standbys: pve1, pve2
    mds: cephfs:1 {0=pve1=up:active} 2 up:standby
    osd: 25 osds: 25 up (since 5d), 25 in (since 8d); 2 remapped pgs

  data:
    pools:   4 pools, 672 pgs
    objects: 1.10M objects, 2.9 TiB
    usage:   8.6 TiB used, 12 TiB / 21 TiB avail
    pgs:     48/3305508 objects degraded (0.001%)
             16/1101836 objects unfound (0.001%)
             669 active+clean
             2   active+recovery_unfound+undersized+degraded+remapped
             1   active+clean+scrubbing+deep

  io:
    client:   680 B/s rd, 2.6 MiB/s wr, 0 op/s rd, 151 op/s wr

I am not really concerned about the lost data, since I am 99% sure it
belonged to a faulty Prometheus server anyway.
The question is: how can I clear these errors without affecting the
other objects?
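
For reference, here is a minimal sketch of how the affected PGs and their unfound counts can be pulled out of the `ceph health detail` text above. The function name `unfound_by_pg` and the regular expression are my own illustration, not part of any Ceph tooling, and the sample text is just the OBJECT_UNFOUND part of the output pasted in this mail.

```python
import re

# OBJECT_UNFOUND section copied from the `ceph health detail` output above.
HEALTH_DETAIL = """\
OBJECT_UNFOUND 16/1101836 objects unfound (0.001%)
    pg 1.37 has 6 unfound objects
    pg 1.48 has 10 unfound objects
"""

# Matches lines such as "pg 1.37 has 6 unfound objects".
# A PG id is "<pool number>.<hex suffix>".
UNFOUND_LINE = re.compile(
    r"pg\s+(\d+\.[0-9a-f]+)\s+has\s+(\d+)\s+unfound objects"
)


def unfound_by_pg(text):
    """Return {pgid: unfound_count} parsed from health-detail text.

    Hypothetical helper for illustration only.
    """
    return {pgid: int(count) for pgid, count in UNFOUND_LINE.findall(text)}


if __name__ == "__main__":
    counts = unfound_by_pg(HEALTH_DETAIL)
    for pgid, count in sorted(counts.items()):
        print(f"pg {pgid}: {count} unfound")
    # The per-PG counts should add up to the cluster-wide figure (16 here).
    print("total:", sum(counts.values()))
```

The per-PG counts (6 and 10) add up to the 16 unfound objects reported in the summary line, so only pg 1.37 and pg 1.48 appear to be involved.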

Thankful for any pointers!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx