Hello Cephers,
I'm trying to diagnose who is doing what on our cluster, which has
suffered from SLOW_OPS and periods of high latency since Pacific.
However, I can't see all pools / images in the RBD stats.
I had activated RBD image stats while running Octopus; now it seems we
only need to define mgr/prometheus/rbd_stats_pools.
I have set it to '*' to catch all pools.
First question: even when I explicitly specify an EC data pool, it
doesn't seem to have stats.
I would understand if image stats were collected on the metadata pool
only. Is that correct?
Second question: only 3 pools (out of ~20; I use OpenStack with all its
pools) show up in Prometheus metrics such as ceph_rbd_read_ops.
So, whether in the Dashboard graphs or in my Grafana, I can only see
metrics for these pools.
Hmm, I just noticed one thing... I have no images in the other pools.
Gnocchi does not store RBD images, my cinder-backup pool is empty, and
so is my second cinder pool.
And finally, none of the radosgw pools store RBD images either.
So I think I have my answer to that second question.
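
For anyone who wants to check the same thing, here is a minimal Python
sketch that counts how many per-image series each pool exposes in the
mgr/prometheus output. The sample text, pool names and image names are
made up for illustration, and I am assuming the ceph_rbd_* series carry
pool, namespace and image labels. A pool without any RBD image would
simply not appear in the result.

```python
import re
from collections import Counter

# Hypothetical excerpt of the mgr/prometheus metrics output.
# Pool names, image names and values are invented for this example.
SAMPLE_METRICS = """\
ceph_rbd_read_ops{pool="volumes",namespace="",image="volume-a"} 1200.0
ceph_rbd_read_ops{pool="volumes",namespace="",image="volume-b"} 30.0
ceph_rbd_read_ops{pool="images",namespace="",image="image-c"} 7.0
ceph_pool_rd{pool_id="3"} 99.0
"""

# One sample line: metric name, label set in braces, then the value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def images_per_pool(metrics_text, metric="ceph_rbd_read_ops"):
    """Return {pool: number of image series} for the given metric name."""
    counts = Counter()
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line)
        if not match or match.group("name") != metric:
            continue  # comments, blank lines and other metrics are skipped
        labels = dict(LABEL_RE.findall(match.group("labels")))
        counts[labels.get("pool", "?")] += 1
    return dict(counts)


if __name__ == "__main__":
    print(images_per_pool(SAMPLE_METRICS))
```

In practice the text would come from the mgr/prometheus endpoint instead
of the hard-coded sample.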
Anyway, it's strange that the pool statistics do not match the sum of
the statistics of the RBD images in that pool:
sum(irate(ceph_rbd_write_bytes{cluster="mycluster",pool="myvolumepool"}[1m]))
irate(ceph_pool_wr_bytes{cluster="mycluster",pool_id="myvolumedatapoolid"}[1m])
ceph_pool_wr_bytes on the data pool is more than 10 times the sum of
all ceph_rbd_write_bytes on the metadata pool.
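
To make the comparison explicit, here is a minimal Python sketch of what
the two queries above compute: a per-second rate from the last two
samples of each counter (roughly what irate() does), summed over the
images and compared with the pool counter. The sample timestamps and
byte counts are invented, and the function names are only illustrative.
It does not explain the discrepancy, it only shows how the ratio is
obtained.

```python
def irate(samples):
    """Per-second rate from the last two (timestamp, counter) samples.

    Roughly mirrors PromQL irate(): returns None when fewer than two
    samples are available, and treats a decreasing counter as a reset.
    """
    if len(samples) < 2:
        return None
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    if t1 <= t0:
        return None
    delta = v1 - v0
    if delta < 0:  # counter reset: count the new value as the increase
        delta = v1
    return delta / (t1 - t0)


def compare(image_series, pool_series):
    """Return (sum of image rates, pool rate, pool/image ratio)."""
    rates = (irate(samples) for samples in image_series.values())
    image_total = sum(r for r in rates if r is not None)
    pool_rate = irate(pool_series)
    ratio = None
    if pool_rate is not None and image_total:
        ratio = pool_rate / image_total
    return image_total, pool_rate, ratio


if __name__ == "__main__":
    # Hypothetical samples: (unix timestamp in seconds, bytes written).
    images = {
        "volume-a": [(0, 0), (10, 1000)],
        "volume-b": [(0, 500), (10, 1500)],
    }
    pool = [(0, 0), (10, 24000)]
    print(compare(images, pool))
```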