Hi,
I assume you're still on an older Pacific release? This was fixed by PR
[1][2]: the warning is suppressed when the autoscaler is on, and the fix
was merged into Pacific 16.2.8 [3].
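If you're not sure which release your daemons are actually running,
'ceph versions' will tell you:

# ceph versions

If everything already reports 16.2.8 or newer, the warning should be
muted as long as the autoscaler is on for that pool.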
I can't say why the autoscaler doesn't increase the pg_num, but yes,
you can increase it yourself. The cephfs_metadata pool should be on
fast storage and doesn't hold a huge amount of data, so the change
should be relatively quick. What does 'ceph osd df' show?
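If you do decide to bump it manually, something along these lines
should work (128 is only an example target; pick a value that fits
your 18 OSDs and what 'ceph osd df' shows; since Nautilus, pgp_num
follows pg_num automatically):

# ceph osd pool set cephfs_metadata pg_num 128

You may also want to raise pg_num_min for that pool (or set its
autoscale mode to off) so the autoscaler doesn't scale it back down:

# ceph osd pool set cephfs_metadata pg_num_min 128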
Regards,
Eugen
[1] https://tracker.ceph.com/issues/53644
[2] https://github.com/ceph/ceph/pull/45152
[3] https://docs.ceph.com/en/latest/releases/pacific/#v16-2-8-pacific
Quoting Edouard FAZENDA <e.fazenda@xxxxxxx>:
Hello Ceph community,
Since this morning I have had a MANY_OBJECTS_PER_PG warning on one pool,
cephfs_metadata:
# ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than
average
pool cephfs_metadata objects per pg (154151) is more than 10.0215 times
cluster average (15382)
I have autoscaling on for all pools:
# ceph osd pool autoscale-status
POOL                        SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
device_health_metrics       9523k                3.0   26827G        0.0000                                 1.0   1                   on
cephfs_data                 5389G                2.0   26827G        0.4018                                 1.0   512                 on
cephfs_metadata             19365M               2.0   26827G        0.0014                                 4.0   16                  on
.rgw.root                   1323                 3.0   26827G        0.0000                                 1.0   32                  on
default.rgw.log             23552                3.0   26827G        0.0000                                 1.0   32                  on
default.rgw.control         0                    3.0   26827G        0.0000                                 1.0   32                  on
default.rgw.meta            11911                3.0   26827G        0.0000                                 4.0   8                   on
default.rgw.buckets.index   0                    3.0   26827G        0.0000                                 4.0   8                   on
default.rgw.buckets.data    497.0G               3.0   26827G        0.0556                                 1.0   32                  on
kubernetes                  177.2G               2.0   26827G        0.0132                                 1.0   32                  on
default.rgw.buckets.non-ec  432                  3.0   26827G        0.0000                                 1.0   32                  on
Currently pg_num is 16 for the cephfs_metadata pool, but the autoscaler
does not suggest any NEW PG_NUM.
Here is the replicated size of all my pools:
# ceph osd dump | grep 'replicated size'
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 189372
flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
pool 10 'cephfs_data' replicated size 2 min_size 1 crush_rule 1 object_hash
rjenkins pg_num 512 pgp_num 512 autoscale_mode on last_change 189346 lfor
0/0/183690 flags hashpspool,selfmanaged_snaps stripe_width 0 application
cephfs
pool 11 'cephfs_metadata' replicated size 2 min_size 1 crush_rule 1
object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change
187861 lfor 0/187861/187859 flags hashpspool stripe_width 0
pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 18 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 5265 flags
hashpspool stripe_width 0 application rgw
pool 19 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 5267
flags hashpspool stripe_width 0 application rgw
pool 20 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 5269
flags hashpspool stripe_width 0 application rgw
pool 21 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 5398
lfor 0/5398/5396 flags hashpspool stripe_width 0 pg_autoscale_bias 4
pg_num_min 8 application rgw
pool 22 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule
0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 7491
lfor 0/7491/7489 flags hashpspool stripe_width 0 pg_autoscale_bias 4
pg_num_min 8 application rgw
pool 23 'default.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 7500
flags hashpspool stripe_width 0 application rgw
pool 24 'kubernetes' replicated size 2 min_size 1 crush_rule 1 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 189363 lfor
0/0/7560 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 25 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule
0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
23983 flags hashpspool stripe_width 0 application rgw
Why is the autoscaler not acting to increase the pg_num of the pool in
warning?
As PGCalc is no longer available on the Ceph website, do you think it is
a good idea to increase the pg_num of cephfs_metadata manually, and if
so, which value should I set?
I have 18 OSDs.
Thanks for the help
Best Regards,
Edouard FAZENDA
Technical Support
Chemin du Curé-Desclouds 2, CH-1226 THONEX +41 (0)22 869 04 40
www.csti.ch
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx