Hi, we run a production Ceph platform in our OpenStack farm. The platform has the following specs: 1 admin node, 3 monitors, and 7 Ceph nodes with 160 OSDs (1.2 TB 10K SAS HDDs). About 30 OSDs have SSD journals so far; we are in the middle of that upgrade... ;-)
The whole network runs on 10 Gb Ethernet links, and we are now seeing slow and blocked requests; no problem, we are diagnosing the platform. One of the problems is the placement group count, which by mistake has not been updated for a long time. Our pools:

GLOBAL:
    SIZE   AVAIL    RAW USED   %RAW USED   OBJECTS
    173T   56319G   118T       68.36       10945k
POOLS:
    NAME      ID   CATEGORY   USED     %USED   MAX AVAIL   OBJECTS   DIRTY   READ     WRITE
    rbd       0    -          0        0       14992G      0         0       1        66120
    volumes   6    -          42281G   23.75   14992G      8871636   8663k   47690M   55474M
    images    7    -          18151G   10.20   14992G      2324108   2269k   1456M    1622k
    backups   8    -          0        0       14992G      1         1       18578    104k
    vms       9    -          91575M   0.05    14992G      12827     12827   2526k    6863k

The PG counts on the pools we actually use are: volumes 2048, images 1024. After checking the network, servers, hardware, disks, software, bugs, logs, etc., we believe our performance problem is the PG count on the volumes pool.

Our question: how can we increase pg_num, and afterwards pgp_num, on a production environment without interrupting service, degrading performance, or taking down the virtual instances? The last increase, from 512 to 1024 on the images pool, caused a 2-hour service outage because the platform could not absorb the data movement... we are scared :-(
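What we have in mind is roughly the following (just a sketch; the throttle values, the step size of 128 PGs, and the stopping point of 2560 are guesses on our part, not tested values):

    # 1) Throttle backfill/recovery so client I/O keeps priority
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

    # 2) Split the PGs of the volumes pool in small steps
    pool=volumes
    for pg in 2176 2304 2432 2560; do
        ceph osd pool set "$pool" pg_num "$pg"
        # wait until the new (still empty) PGs have been created
        while ceph status | grep -q creating; do sleep 30; done

        # pgp_num must follow pg_num, or the new PGs are never rebalanced
        ceph osd pool set "$pool" pgp_num "$pg"
        # wait for backfill to finish before the next step
        until ceph health | grep -q HEALTH_OK; do sleep 60; done
    done

Is waiting for HEALTH_OK between steps enough, or should we watch the backfill in some other way?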
Can we make this change in small increments over two weeks, roughly as in the sketch above? How? Thanks, thanks, thanks!
Emilio Moreno Fernández