On Mon, Aug 3, 2015 at 4:05 PM, 乔建峰 <scaleqiao@xxxxxxxxx> wrote:
> [Including ceph-users alias]
>
> 2015-08-03 16:01 GMT+08:00 乔建峰 <scaleqiao@xxxxxxxxx>:
>>
>> Hi Cephers,
>>
>> I'm currently experiencing an issue that has been troubling me a lot,
>> so I'm writing to ask for your comments/help/suggestions. More details
>> are provided below.
>>
>> Issue:
>> I set up a cluster with 24 OSDs and created one pool with 1024
>> placement groups on it for a small startup company. The number 1024
>> was calculated from the equation (OSDs * 100) / pool size. The cluster
>> has been running quite well for a long time, but recently our
>> monitoring system keeps complaining that some disks' usage exceeds
>> 85%. I logged into the system and found that some disks' usage really
>> is very high, while others' is not (less than 60%). Each time the
>> issue happens, I have to re-balance the distribution manually. That is
>> only a short-term fix, and I'm not willing to do it all the time.
>>
>> Two long-term solutions come to mind:
>> 1) Ask the customers to expand their clusters by adding more OSDs. But
>> I think they will ask me to explain the reason for the imbalanced data
>> distribution. We've already done some analysis of the environment and
>> learned that the most imbalanced part of CRUSH is the mapping between
>> objects and PGs: the biggest PG has 613 objects, while the smallest PG
>> has only 226 objects.
>>
>> 2) Increase the number of placement groups. That can be of great help
>> for statistically uniform data distribution, but it can also incur
>> significant data movement as PGs are effectively split. I just cannot
>> do it in our customers' environment before we fully understand the
>> consequences. Has anyone done this in a production environment? How
>> much does this operation affect the performance of clients?
>>
>> Any comments/help/suggestions will be highly appreciated.

Of course not; PG split isn't a recommended process for a running
cluster. It will block client IO completely. Unlike the recovery
process, which can be controlled at the object level, split is a
PG-level process and the OSD itself can't control it smoothly. In
theory, to make PG split work on a real cluster we would need to do more
on the MON side, and a lot of that logic would cause trouble.

Although we can't get that flexibility via PG split, we can get the same
result from *pools* with a little user-management logic. A pool is a
good tool that can cover your need. Most users like to have one pool for
the whole cluster; that's fine for an immutable cluster, but not good
for a flexible one, I think. For example, if you double the OSD nodes,
creating a new pool is a better way than preparing a pool with lots of
PGs at the very beginning. If you are using OpenStack, CloudStack or the
like, these cloud projects can provide the upper-layer control via
"volume_type".

In a word, we can handle increasing OSDs at a relatively small cost. But
I don't think we can freely double the Ceph cluster and just hope Ceph
handles it perfectly.

>>
>> --
>> Best Regards
>> Jevon
>
> --
> Best Regards
> Jevon

--
Best Regards,
Wheat
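
Referring back to the (OSDs * 100) / pool size rule quoted at the top of
the thread, here is a minimal Python sketch of how that rule of thumb is
usually applied, with the result rounded up to the next power of two.
The function name and the assumption of 3 replicas are illustrative and
not stated in the thread.

    # Minimal sketch of the PG-count rule of thumb quoted above:
    # target = (OSDs * 100) / replica count, rounded up to the next
    # power of two. With 24 OSDs and 3 replicas this yields 1024,
    # matching the pool described in the original mail.
    def suggested_pg_num(num_osds, replica_count=3):
        raw = (num_osds * 100) / replica_count
        pg_num = 1
        while pg_num < raw:
            pg_num *= 2
        return pg_num

    print(suggested_pg_num(24))  # -> 1024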