I did this not that long ago. My original PG estimates were wrong and I had to increase them. After increasing the PG numbers, Ceph rebalanced, and that took a while. To be honest, in my case the slowdown wasn't really noticeable, but the rebalance itself took quite some time. My strong suggestion would be to do it during a low-I/O window, and be prepared for it to take quite a long time to complete. Do it slowly and do not increase multiple pools at once (a rough sketch of the step-by-step approach is at the end of this message). It isn't recommended practice, but it is doable.

> On Aug 4, 2015, at 10:46 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
>
> It will cause a large amount of data movement. Each new PG after the
> split will relocate. It might be OK if you do it slowly. Experiment
> on a test cluster.
> -Sam
>
> On Mon, Aug 3, 2015 at 12:57 AM, 乔建峰 <scaleqiao@xxxxxxxxx> wrote:
>> Hi Cephers,
>>
>> This is a greeting from Jevon. I'm currently experiencing an issue that is
>> causing me a lot of trouble, so I'm writing to ask for your
>> comments/help/suggestions. More details are provided below.
>>
>> Issue:
>> I set up a cluster with 24 OSDs for a small startup company and created one
>> pool with 1024 placement groups on it. The number 1024 was calculated per
>> the equation 'OSDs * 100' / pool size. The cluster has been running quite
>> well for a long time. Recently, however, our monitoring system keeps
>> complaining that some disks' usage exceeds 85%. When I log into the system,
>> I find that some disks' usage is indeed very high, while others are below
>> 60%. Each time the issue happens, I have to rebalance the distribution
>> manually. That is only a short-term fix, and I'm not willing to do it all
>> the time.
>>
>> Two long-term solutions come to mind:
>> 1) Ask the customers to expand their clusters by adding more OSDs. But I
>> think they will ask me to explain the reason for the imbalanced data
>> distribution. We've already done some analysis of the environment and
>> learned that the most imbalanced part of CRUSH is the mapping between
>> objects and PGs. The biggest PG has 613 objects, while the smallest has
>> only 226.
>>
>> 2) Increase the number of placement groups. That helps a lot toward a
>> statistically uniform data distribution, but it can also incur significant
>> data movement, since PGs are effectively being split. I can't do it in our
>> customers' environment until we fully understand the consequences. Has
>> anyone done this in a production environment? How much does the operation
>> affect client performance?
>>
>> Any comments/help/suggestions will be highly appreciated.
>>
>> --
>> Best Regards
>> Jevon
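
The sketch mentioned above: this is not from the original thread, just a minimal illustration of the "do it slowly" approach, assuming the standard 'ceph osd pool set <pool> pg_num / pgp_num' commands are available on the admin host. The pool name, target PG count, step size, and polling interval are placeholders, not values from the thread.

    import subprocess
    import time

    POOL = "data"          # hypothetical pool name
    TARGET_PG_NUM = 2048   # hypothetical target
    STEP = 128             # hypothetical increment per step

    def ceph(*args):
        # Run a 'ceph' CLI subcommand and return its trimmed output.
        return subprocess.check_output(("ceph",) + args).decode().strip()

    def wait_for_health_ok(poll_seconds=60):
        # Poll 'ceph health' until backfill/recovery from the previous step settles.
        while not ceph("health").startswith("HEALTH_OK"):
            time.sleep(poll_seconds)

    current = int(ceph("osd", "pool", "get", POOL, "pg_num").split()[-1])
    while current < TARGET_PG_NUM:
        current = min(current + STEP, TARGET_PG_NUM)
        ceph("osd", "pool", "set", POOL, "pg_num", str(current))
        wait_for_health_ok()
        # Raising pgp_num is what actually moves data onto the newly split PGs.
        ceph("osd", "pool", "set", POOL, "pgp_num", str(current))
        wait_for_health_ok()

The idea is simply to raise pg_num in small increments and let the cluster settle after each pgp_num bump, rather than jumping straight to the final value; adjust the step and polling to taste on a test cluster first, as Sam suggests.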
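For reference, a tiny worked example of the 'OSDs * 100' / pool size rule of thumb Jevon quotes, assuming a replication size of 3 (the thread does not state it); rounding up to the next power of two reproduces the 1024 from the original post.

    osds = 24
    pool_size = 3                             # assumed replica count, not stated in the thread
    target = osds * 100 // pool_size          # 800
    pg_num = 1 << (target - 1).bit_length()   # next power of two >= 800
    print(pg_num)                             # 1024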