Re: HEALTH_WARN 1 pools have too few placement groups


 



This was a bug in the PG calculation for EC pools in 14.2.7.



It has been fixed in 14.2.8.
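
For what it's worth, here is a rough sketch of what the autoscaler should
come up with for a pool like yours, assuming the default
mon_target_pg_per_osd of 100 and that all 220 OSDs back the pool (this is
only the shape of the math, not the actual mgr code):

osds = 220               # OSDs behind the pool (assumption)
target_pg_per_osd = 100  # mon_target_pg_per_osd default
pool_size = 6 + 3        # k + m for the EC 6+3 pool
capacity_ratio = 1.0     # your target_size_ratio
ideal = capacity_ratio * osds * target_pg_per_osd / pool_size
print(round(ideal))      # ~2444, i.e. a power of two around 2048-4096

The suggested 16384 is well outside anything that arithmetic can produce,
which is why it comes down to the calculation bug rather than expected
behaviour.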





---- On Mon, 16 Mar 2020 16:21:41 +0800 Dietmar Rieder <dietmar.rieder@xxxxxxxxxxx> wrote ----



Hi, 
 
I was planning to activate the pg_autoscaler on an EC (6+3) pool which I 
created two years ago. 
Back then I calculated the total number of PGs for this pool with a target 
of 150 PGs per OSD (that was the recommended per-OSD PG count as far as I 
recall). 
 
I used the Red Hat Ceph PG per pool calculator [1] with the following 
settings: 
Pool Type: EC 
K: 6 
M: 3 
OSD #: 220 
%Data: 100.00 
Target PGs per OSD: 150 
 
This resulted in a suggested PG count of 4096, which I used to create 
the pool. 
(BTW: I get the same result from the ceph.io pgcalc [2] when setting size 
to 9 (= EC 6+3).) 
 
# ceph osd df shows that I now have 162-169 PGs per OSD 
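
For reference, here is my reading of the arithmetic behind those numbers
(assuming the calculators simply round the result up to the next power of
two; this is not their exact code):

import math

osds, target_per_osd, pct_data = 220, 150, 1.00  # %Data 100.00 as a fraction
size = 6 + 3                                     # EC pool size = k + m
raw = osds * target_per_osd * pct_data / size    # ~3666.7 PGs
pg_num = 2 ** math.ceil(math.log2(raw))          # round up to a power of two -> 4096
print(pg_num, pg_num * size / osds)              # 4096, ~167.6 PG instances per OSD

The ~167.6 PG instances per OSD also fits the 162-169 range I see in
ceph osd df.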
 
Now I enabled the pg_autoscaler and set pg_autoscale_mode to "warn", using 
the following commands: 
 
# ceph mgr module enable pg_autoscaler 
# ceph osd pool set hdd-ec-data-pool pg_autoscale_mode warn 
 
When I set the target_size_ratio to 1, using: 
 
# ceph osd pool set hdd-ec-data-pool target_size_ratio 1 
 
I get the warning: 
1 pools have too few placement groups 
 
The autoscale status shows that the PG count of the EC pool would get 
scaled up to 16384 PGs: 
 
# ceph osd pool autoscale-status 
POOL                   SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE 
ssd-rep-metadata-pool  11146M               3.0   29805G        0.0011                1.0   64      4           off 
ssd-rep-data-pool      837.8G               3.0   29805G        0.0843                1.0   1024    64          off 
hdd-ec-data-pool       391.3T               1.5   1614T         0.3636  1.0000        1.0   4096    16384       warn 
 
How does this relate to the calculated number of 4096 PGs for that EC 
pool? I mean, 16384 is quite a jump from the original value. 
Is the autoscaler miscalculating something or is the pg calculator wrong? 
 
Can someone explain it? (My cluster is on ceph version 14.2.7) 
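
(If it helps: as far as I understand the Nautilus autoscaler, it only
raises this warning when the suggested pg_num is more than about 3x away
from the current value, and here the gap is exactly 4x. The threshold below
is my assumption, not taken from the code.)

current_pg, suggested_pg = 4096, 16384
threshold = 3.0                     # assumed autoscaler change threshold
factor = suggested_pg / current_pg  # 4.0
print(factor > threshold)           # True -> pool gets flagged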
 
 
 
Thanks so much 
 Dietmar 
 
[1]: https://access.redhat.com/labs/cephpgc/ 
[2]: https://ceph.io/pgcalc/ 
 
 
-- 
_________________________________________ 
D i e t m a r  R i e d e r, Mag.Dr. 
Innsbruck Medical University 
Biocenter - Institute of Bioinformatics 
Innrain 80, 6020 Innsbruck 
Email: dietmar.rieder@xxxxxxxxxxx 
Web: http://www.icbi.at 
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


