Dear Ceph experts,

We recently upgraded our Ceph cluster from Octopus (15.2.17) to Pacific (16.2.14, then 16.2.15). Immediately after the upgrade, warnings appeared that all of our pools (except the device_health_metrics pool) have too many placement groups. These warnings look like they are generated by the autoscaler, but it offers no suggestions. While investigating, I set the bulk flag to true on the pools intended to store large amounts of data (just to see what would change). After that, most pools report that they have too few placement groups, but there are still no suggestions from the autoscaler (the NEW PG_NUM column is empty):

# ceph osd pool autoscale-status
POOL                    SIZE    TARGET SIZE  RATE                RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
metadata                22567M               3.0                 57255G        0.0012                                 1.0       32              warn       False
rbd_hdd02_ec_meta       20817k               3.0                 365.3T        0.0000                                 1.0       32              warn       False
device_health_metrics    1002M               3.0                 205.4T        0.0000                                 1.0        1              on         False
rbd_ssd_ec_meta              0               3.0                 57255G        0.0000                                 1.0       32              warn       False
cmd_spool               45094G               1.3333333730697632  205.4T        0.2859                                 1.0      512              warn       True
data                    60496G               3.0                 365.3T        0.4851                                 1.0      512              warn       True
rbd                      2625G               3.0                 205.4T        0.0374                                 1.0       64              warn       True
kedr_spool               6900G  10240G       1.3333333730697632  205.4T        0.0649                                 1.0      128              warn       True
gcf_spool                1119M               1.3333333730697632  205.4T        0.0000                                 1.0       32              warn       True
rbd_ssd                 107.8G               3.0                 57255G        0.0056                                 1.0       32              warn       True
data_ssd                775.8G               3.0                 57255G        0.0407                                 1.0      128              warn       True

The only warnings are in 'ceph status' and 'ceph health detail':

# ceph health detail
HEALTH_WARN 6 pools have too few placement groups; 4 pools have too many placement groups
[WRN] POOL_TOO_FEW_PGS: 6 pools have too few placement groups
    Pool data has 512 placement groups, should have 2048
    Pool rbd has 64 placement groups, should have 512
    Pool kedr_spool has 128 placement groups, should have 256
    Pool gcf_spool has 32 placement groups, should have 256
    Pool rbd_ssd has 32 placement groups, should have 2048
    Pool data_ssd has 128 placement groups, should have 2048
[WRN] POOL_TOO_MANY_PGS: 4 pools have too many placement groups
    Pool metadata has 32 placement groups, should have 32
    Pool rbd_hdd02_ec_meta has 32 placement groups, should have 32
    Pool rbd_ssd_ec_meta has 32 placement groups, should have 32
    Pool cmd_spool has 512 placement groups, should have 512

When I changed the number of placement groups of the kedr_spool pool from 128 to 256, as suggested in the warning, the data rebalanced successfully, but now the pool warns that it has too many placement groups: "Pool kedr_spool has 256 placement groups, should have 256". The strangest part of these warnings, to me, is that the current number of PGs is the same as the number it "should have". What does this mean, and how can I resolve these warnings?

Best regards,
Dmitriy
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx