In my test environment I changed the reweight of an OSD. After this, some PGs got stuck in the 'active+remapped' state. I can only repair it by stepping back to the old value of the reweight.

Here is my ceph tree:

> # id    weight  type name               up/down reweight
> -1      12      root default
> -4      12         room serverroom
> -2      12            host test1
> 0       2                osd.0          up      0.7439
> 1       2                osd.1          up      0.9
> 2       4                osd.2          up      1
> 3       4                osd.3          up      1
> -3      0             host test2

I changed osd.1 from 1.0 to 0.9 and then this happened:

> :# ceph health detail
> HEALTH_WARN 10 pgs stuck unclean; recovery 94/2976 objects misplaced (3.159%)
> pg 6.4 is stuck unclean for 1135.549938, current state active+remapped, last acting [1,2,3]
> [...]

ceph pg dump shows the primary OSD of PG 6.4 as non-existent (MAXINT). I have no idea what happened here. Pool 6 is an erasure coded pool (k=2, m=1).

Here is the last part of the query output for PG 6.4:

> :# ceph pg 6.4 query
> [...]
> "recovery_state": [
>       { "name": "Started\/Primary\/Active",
>         "enter_time": "2015-01-03 17:16:10.054846",
>         "might_have_unfound": [],
>         "recovery_progress": { "backfill_targets": [],
>             "waiting_on_backfill": [],
>             "last_backfill_started": "0\/\/0\/\/-1",
>             "backfill_info": { "begin": "0\/\/0\/\/-1",
>                 "end": "0\/\/0\/\/-1",
>                 "objects": []},
>             "peer_backfill_info": [],
>             "backfills_in_flight": [],
>             "recovering": [],
>             "pg_backend": { "recovery_ops": [],
>                 "read_ops": []}},
>         "scrub": { "scrubber.epoch_start": "0",
>             "scrubber.active": 0,
>             "scrubber.block_writes": 0,
>             "scrubber.waiting_on": 0,
>             "scrubber.waiting_on_whom": []}},
>       { "name": "Started",
>         "enter_time": "2015-01-03 17:16:09.073069"}],

Any idea what happened, or have I done something wrong here?

Greetings!
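
P.S. I no longer have the exact shell history, so take the following as a rough sketch of the commands I used rather than a verbatim transcript:

> :# ceph osd reweight 1 0.9    # the change that triggered the remapping
> :# ceph osd reweight 1 1.0    # stepping back to the old value makes the PGs clean again

(This is the override reweight shown in the last column of the tree above, not the CRUSH weight.)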
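
P.P.S. To look at the mapping of the stuck PG and the pool settings I used something like this (again from memory):

> :# ceph pg map 6.4
> :# ceph pg dump | grep ^6.4
> :# ceph osd dump | grep 'pool 6'

The pg dump output is where the non-existent (MAXINT) primary shows up.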