Hi,
I'm still seeing this issue during failure testing of a new Mimic 13.2.4
cluster. To reproduce:
- Working Mimic 13.2.4 cluster
- Pull a disk
- Wait for recovery to complete (i.e. back to HEALTH_OK)
- Remove the OSD with `ceph osd crush remove`
- See greater than 100% degraded objects while it recovers as below
It doesn't seem to do any harm: once recovery completes, the cluster
returns to HEALTH_OK.
The only report I can find on the tracker that seems to cover this
behaviour is bug 21803, which is marked as resolved.
Simon
  cluster:
    id:     MY ID
    health: HEALTH_WARN
            709/58572 objects misplaced (1.210%)
            Degraded data redundancy: 90094/58572 objects degraded
            (153.818%), 49 pgs degraded, 51 pgs undersized

  services:
    mon: 3 daemons, quorum san2-mon1,san2-mon2,san2-mon3
    mgr: san2-mon1(active), standbys: san2-mon2, san2-mon3
    osd: 52 osds: 52 up, 52 in; 84 remapped pgs

  data:
    pools:   16 pools, 2016 pgs
    objects: 19.52 k objects, 72 GiB
    usage:   7.8 TiB used, 473 TiB / 481 TiB avail
    pgs:     90094/58572 objects degraded (153.818%)
             709/58572 objects misplaced (1.210%)
             1932 active+clean
             47   active+recovery_wait+undersized+degraded+remapped
             33   active+remapped+backfill_wait
             2    active+recovering+undersized+remapped
             1    active+recovery_wait+undersized+degraded
             1    active+recovering+undersized+degraded+remapped

  io:
    client:   24 KiB/s wr, 0 op/s rd, 3 op/s wr
    recovery: 0 B/s, 126 objects/s
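
For anyone who wants to catch this automatically during failure testing, here is a minimal sketch that recomputes the percentages from the raw counts in the status text and flags any ratio whose count exceeds its total. The function names and the regular expression are illustrative only, not part of any Ceph tooling. Note that the denominator looks like a count of object copies rather than objects: 58572 is exactly 3 x 19524, which would match the reported 19.52 k objects if the pools are 3-way replicated (an assumption, since the pool sizes are not shown).

```python
import re

# Matches count pairs such as "90094/58572 objects degraded (153.818%)".
# The pattern is an assumption based on the status text quoted above.
RATIO_RE = re.compile(r"(\d+)/(\d+) objects (degraded|misplaced) \(([\d.]+)%\)")


def parse_ratios(status_text):
    """Return {kind: (count, total, percent)}, recomputed from the raw counts."""
    ratios = {}
    for count, total, kind, _reported in RATIO_RE.findall(status_text):
        count, total = int(count), int(total)
        ratios[kind] = (count, total, 100.0 * count / total)
    return ratios


def impossible_ratios(ratios):
    """Return the kinds whose count exceeds the total number of object copies."""
    return sorted(k for k, (count, total, _pct) in ratios.items() if count > total)


# Two lines copied from the status output above.
sample = """\
pgs: 90094/58572 objects degraded (153.818%)
709/58572 objects misplaced (1.210%)
"""

ratios = parse_ratios(sample)
for kind, (count, total, pct) in ratios.items():
    print(f"{kind}: {count}/{total} = {pct:.3f}%")
print("over 100%:", impossible_ratios(ratios))
```

Fed the two lines above, this reports only the degraded ratio as exceeding 100%; the misplaced ratio is within bounds.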
On 13/10/2017 18:53, David Zafman wrote:
I improved the code that computes degraded objects during
backfill/recovery. During my testing it never produced a percentage
above 100%. I'll have to look at the code and verify that some
subsequent changes didn't break things.
David
On 10/13/17 9:55 AM, Florian Haas wrote:
Okay, in that case I've no idea. What was the timeline for the recovery
versus the rados bench and cleanup versus the degraded object counts,
then?
1. Jewel deployment with filestore.
2. Upgrade to Luminous (including mgr deployment and "ceph osd
require-osd-release luminous"), still on filestore.
3. rados bench with subsequent cleanup.
4. All OSDs up, all PGs active+clean.
5. Stop one OSD. Remove from CRUSH, auth list, OSD map.
6. Reinitialize OSD with bluestore.
7. Start OSD, commencing backfill.
8. Degraded objects above 100%.
Please let me know if that information is useful. Thank you!
Hmm, that does leave me a little perplexed.
Yeah exactly, me too. :)
David, do we maybe do something with degraded counts based on the number
of objects identified in pg logs? Or some other heuristic for number of
objects that might be stale? That's the only way I can think of to get
these weird returning sets.
One thing that just crossed my mind: would it make a difference whether
or not the OSD is marked out in the time window between it going down
and being deleted from the crushmap/osdmap? I think it shouldn't
(whether marked out or just non-existent, it's not eligible for holding
any data either way), but I'm not really sure about the mechanics of
the internals here.
Cheers,
Florian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com