Hi Jon, can you reweight one OSD back to its default value and share the output of "ceph osd df tree; ceph -s; ceph health detail"?
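For example, if it was the CRUSH weight that was changed, something like the following (osd.23 and the weight 3.63869 of a typical 4TB drive are just placeholders for your actual OSD id and disk size):

# ceph osd crush reweight osd.23 3.63869
# ceph osd df tree; ceph -s; ceph health detail

or, if it was the override reweight that was lowered, its default is 1.0:

# ceph osd reweight 23 1.0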
Recently I was adding a new node, 12x 4TB, one disk at a time, and hit the activating+remapped state for a few hours.
I am not sure, but that may have been caused by the "osd_max_backfills" value and the queue of PGs waiting for backfill.
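In case it is useful, this is how the running value can be checked and raised without a restart (osd.0 is just an arbitrary OSD to read the current setting from, and 2 is only an example value -- pick what your disks can handle):

# ceph daemon osd.0 config get osd_max_backfills      (run on the node hosting osd.0)
# ceph tell osd.* injectargs '--osd_max_backfills 2'

Note that injectargs only changes the running daemons; put the setting into ceph.conf as well if it should survive a restart.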
# ceph -s
  cluster:
    id:     1023c49f-3a10-42de-9f62-9b122db21e1e
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            1 nearfull osd(s)
            19 pool(s) nearfull
            33336982/289660233 objects misplaced (11.509%)
            Reduced data availability: 29 pgs inactive
            Degraded data redundancy: 788023/289660233 objects degraded (0.272%), 782 pgs unclean, 54 pgs degraded, 48 pgs undersized

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: mon2(active), standbys: mon3, mon1
    osd: 120 osds: 120 up, 120 in; 779 remapped pgs
         flags noscrub,nodeep-scrub
    rgw: 3 daemons active

  data:
    pools:   19 pools, 3760 pgs
    objects: 38285k objects, 146 TB
    usage:   285 TB used, 150 TB / 436 TB avail
    pgs:     0.771% pgs not active
             788023/289660233 objects degraded (0.272%)
             33336982/289660233 objects misplaced (11.509%)
             2978 active+clean
             646  active+remapped+backfill_wait
             57   active+remapped+backfilling
             27   active+undersized+degraded+remapped+backfill_wait
             25   activating+remapped
             17   active+undersized+degraded+remapped+backfilling
             4    activating+undersized+degraded+remapped
             3    active+recovery_wait+degraded
             3    active+recovery_wait+degraded+remapped

  io:
    client:   2228 kB/s rd, 54831 kB/s wr, 539 op/s rd, 756 op/s wr
    recovery: 1360 MB/s, 348 objects/s
Now all PGs are active+clean.
Regards
Jakub