Re: objects misplaced jumps up at 5%

Hi Paul,

I think you found the answer!

When adding 100 new OSDs to the cluster, I increased both pg_num and
pgp_num from 4096 to 16384:

**********************************
[root@ceph1 ~]# ceph osd pool set ec82pool pg_num 16384
set pool 5 pg_num to 16384

[root@ceph1 ~]# ceph osd pool set ec82pool pgp_num 16384
set pool 5 pgp_num to 16384

**********************************

The pg_num increase showed up immediately in "ceph -s".

But, unknown to me, pgp_num did not increase immediately:

"ceph osd pool ls detail" shows that pgp_num is currently 11412.

Each time we hit 5.000% misplaced, the pgp number increases by 1 or 2,
which pushes the % misplaced back up to ~5.1%
... this is why we thought the cluster was not re-balancing.
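
If I've understood Paul correctly, the 5% ceiling comes from the mgr's
target_max_misplaced_ratio option (default 0.05). My understanding is
that it can be inspected, and raised if you can tolerate more data
movement in flight, roughly like this (we haven't changed it
ourselves):

**********************************
[root@ceph1 ~]# ceph config get mgr target_max_misplaced_ratio

[root@ceph1 ~]# ceph config set mgr target_max_misplaced_ratio 0.10
**********************************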


Had I looked at the ceph.audit.log earlier, I'd have seen entries like this:

2020-09-23 01:13:11.564384 mon.ceph3b (mon.1) 50747 : audit [INF]
from='mgr.90414409 10.1.0.80:0/7898' entity='mgr.ceph2' cmd=[{"prefix":
"osd pool set", "pool": "ec82pool", "var": "pgp_num_actual", "val":
"5076"}]: dispatch
2020-09-23 01:13:11.565598 mon.ceph1b (mon.0) 85947 : audit [INF]
from='mgr.90414409 ' entity='mgr.ceph2' cmd=[{"prefix": "osd pool set",
"pool": "ec82pool", "var": "pgp_num_actual", "val": "5076"}]: dispatch
2020-09-23 01:13:12.530584 mon.ceph1b (mon.0) 85949 : audit [INF]
from='mgr.90414409 ' entity='mgr.ceph2' cmd='[{"prefix": "osd pool set",
"pool": "ec82pool", "var": "pgp_num_actual", "val": "5076"}]': finished


Our assumption is that pgp_num will continue to increase until it
reaches its set level, at which point the cluster will complete its
re-balance...
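
In the meantime we're keeping an eye on progress with something along
these lines (illustrative; the interval is arbitrary):

**********************************
[root@ceph1 ~]# watch -n 60 "ceph -s | grep -E 'misplaced|backfill'"
**********************************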

again, many thanks to you both for your help,

Jake

On 28/09/2020 17:35, Paul Emmerich wrote:
> Hi,
> 
> 5% misplaced is the default target ratio for misplaced PGs when any
> automated rebalancing happens; the sources for this are either the
> balancer or pg scaling.
> So I'd suspect that there's a PG change ongoing (either pg autoscaler or
> a manual change, both obey the target misplaced ratio).
> You can check this by running "ceph osd pool ls detail" and checking
> the value of the pg target.
> 
> Also: Looks like you've set osd_scrub_during_recovery = false, this
> setting can be annoying on large erasure-coded setups on HDDs that see
> long recovery times. It's better to get IO priorities right; search
> mailing list for osd op queue cut off high.
> 
> Paul
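
PS: for the archives, I believe the IO priority setting Paul refers to
is osd_op_queue_cut_off; as far as I can tell it can be set
cluster-wide like this (and I think the OSDs need a restart before it
takes effect):

**********************************
[root@ceph1 ~]# ceph config set osd osd_op_queue_cut_off high
**********************************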

-- 
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.


