Re: Investigating active+remapped+wait_backfill pg status

I've just seen that the send_pg_creates command is obsolete and has already been removed in commit 6cbdd6750cf330047d52817b9ee9af31a7d318ae.

So I guess it doesn't do too much :)

Tnx,
Ivan

On Wed, Oct 5, 2016 at 2:37 AM, Ivan Grcic <ivan.grcic@xxxxxxxxx> wrote:
Hi everybody,

I am trying to understand why I keep getting remapped+wait_backfill pg statuses when doing some cluster pg shuffling. Sometimes it happens just from a small reweight-by-utilization operation, and sometimes when I modify the crushmap (bigger movement of data).
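For reference, the reweight operation is along the lines of the following (the threshold value here is only an example, not necessarily the exact one I used):

$ ceph osd test-reweight-by-utilization 110   # dry run: show which OSDs would be reweighted
$ ceph osd reweight-by-utilization 110        # reweight OSDs more than 10% above average utilization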

Taking a look at ceph health detail and investigating some of the pgs with ceph pg <pg_id> query, I can see that all of the "acting" pgs are healthy and of the same size. The "up" pgs do have a pg folder created, but don't have any data inside (empty head + TEMP directories).
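For example, comparing the up and acting sets of one of the affected pgs (the pg id below is just a placeholder):

$ ceph pg map 1.2f          # prints the osdmap epoch plus the up and acting OSD sets
$ ceph pg 1.2f query        # full peering/backfill state of the pg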

I don't have any (near)full OSDs, and ceph pg debug unfound_objects_exist yields FALSE.
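(The checks behind that were roughly:)

$ ceph osd df                             # per-OSD utilization, to rule out (near)full OSDs
$ ceph pg debug unfound_objects_exist     # prints TRUE or FALSE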

The cluster is also 100% functional (but in WARN state), and I can see that if I write some data, the acting pgs all happily sync with each other.

[Acting pgs listing]

[Up pgs listing]


I can simply recover from this by following these steps (a rough command sketch follows below):
  • set noout, norecover, norebalance to avoid unnecessary data movement
  • stop all OSDs in the actingbackfill set (acting + up) at the same time
  • remove the empty "up" pg directories
  • start all those OSDs again
  • unset noout, norecover, norebalance
After that, the new "up" pgs are recreated in a remapped+backfilling state and marked active+clean after some time.
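For reference, the whole workaround boils down to something like this (OSD ids, pg id and paths below are placeholders for the affected actingbackfill set, assuming the default filestore layout):

$ ceph osd set noout; ceph osd set norecover; ceph osd set norebalance

# stop all OSDs in the actingbackfill set at the same time
$ systemctl stop ceph-osd@11 ceph-osd@17 ceph-osd@23

# on the "up" OSDs, remove the empty pg directories (head + TEMP)
$ rm -rf /var/lib/ceph/osd/ceph-23/current/1.2f_head /var/lib/ceph/osd/ceph-23/current/1.2f_TEMP

# start the OSDs again and clear the flags
$ systemctl start ceph-osd@11 ceph-osd@17 ceph-osd@23
$ ceph osd unset noout; ceph osd unset norecover; ceph osd unset norebalance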

I have also tried to "kick the cluster in the head" with ceph pg send_pg_creates (as stated here https://www.mail-archive.com/ceph-devel@xxxxxxxxxxxxxxx/msg12287.html), but I get:

$ ceph pg send_pg_creates                                           
Error EINVAL: (22) Invalid argument

BTW, what is send_pg_creates really supposed to do?

Does anyone have any hints as to why this is occurring?
Thank you,
Ivan

Jewel 10.2.2

size 3
min_size 2

# I haven't been playing with the tunables
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1

#standard ruleset 
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take dc
        step chooseleaf firstn 0 type host
        step emit
}
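(For completeness, the rule above comes from the decompiled crush map, extracted roughly like this:)

$ ceph osd getcrushmap -o crushmap.bin
$ crushtool -d crushmap.bin -o crushmap.txt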


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
