I started with 7 OSDs and expanded to 14, with the PG count going from 512 to 4096, as recommended.
Unfortunately I can't tell you the exact IO impact, as I made my changes during off hours when the impact didn't matter. I could see a reduction in performance, but since it had no effect on me I didn't measure it precisely.
Since I had the luxury of leaving it running overnight, I didn't step the PG count. Under normal circumstances, however, I would highly recommend doing this in stages to reduce the impact you will see.
You will see very significant IO load and CPU time during the operation. The reallocation in my case took over an hour to complete.
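For anyone who wants the staged approach, here is a minimal sketch of what it could look like; the pool name, step size, and pause are illustrative assumptions, not recommendations, and only the stock ceph osd pool set/get commands are used:

# Rough sketch, not a tested script: raise pg_num/pgp_num in steps instead of one jump.
# Assumes the 'ceph' CLI is on PATH; pool name, step size, and target are illustrative.
import subprocess
import time

POOL = "rbd"    # hypothetical pool name
TARGET = 4096   # final pg_num, as in the 512 -> 4096 example above
STEP = 256      # how much to add per stage; pick something sensible for your cluster
SETTLE = 600    # crude pause between stages; better to watch 'ceph -s' until recovery finishes

def pool_set(key, value):
    subprocess.check_call(["ceph", "osd", "pool", "set", POOL, key, str(value)])

def pool_get(key):
    out = subprocess.check_output(["ceph", "osd", "pool", "get", POOL, key]).decode()
    return int(out.split(":")[-1].strip())    # output looks like 'pg_num: 512'

pg = pool_get("pg_num")
while pg < TARGET:
    pg = min(pg + STEP, TARGET)
    pool_set("pg_num", pg)     # splits the PGs
    pool_set("pgp_num", pg)    # lets CRUSH actually move data to the new PGs
    time.sleep(SETTLE)         # give backfill time to settle before the next step

Setting pg_num alone only splits the PGs; the cross-OSD data movement starts when pgp_num is raised, which is why both are stepped together here.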
Thank you and Samuel for the prompt response.

On 5/8/15 00:52, Marek Dohojda wrote:
I did this not that long ago. My original PG estimates were wrong and I had to increase them.
After increasing the PG numbers, Ceph rebalanced, which took a while. To be honest, in my case the slowdown wasn't really noticeable, but the process took some time.
How many OSDs do you have in your cluster? How much did you adjust the PG numbers?

My strong suggestion to you would be to do it in a long IO time, and be prepared that this will take quite a long time to accomplish. Do it slowly, and do not increase multiple pools at once.
Both you and Samuel said to do it slowly. Do you mean adjusting the pg numbers step by step rather than in one jump? Also, would you please explain 'a long IO time' in detail?

Thanks,
Jevon

It isn't recommended practice, but it is doable.
On Aug 4, 2015, at 10:46 AM, Samuel Just <sjust@xxxxxxxxxx> wrote:
It will cause a large amount of data movement. Each new pg after the split will relocate. It might be ok if you do it slowly. Experiment on a test cluster. -Sam
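If you do experiment first, a simple way to pace the steps is to wait for the cluster to report healthy again before applying the next increase. A minimal sketch, assuming only the stock 'ceph health' command (the exact status strings can differ between releases):

# Sketch: block until 'ceph health' reports HEALTH_OK, so the next pg_num step
# is only applied once backfill/recovery from the previous one has finished.
import subprocess
import time

def wait_for_health_ok(poll_seconds=30):
    while True:
        status = subprocess.check_output(["ceph", "health"]).decode().strip()
        if status.startswith("HEALTH_OK"):
            return
        print(status)    # e.g. a HEALTH_WARN line while PGs are still backfilling
        time.sleep(poll_seconds)

wait_for_health_ok()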
On Mon, Aug 3, 2015 at 12:57 AM, 乔建峰 <scaleqiao@xxxxxxxxx> wrote:
Hi Cephers,
This is Jevon. Currently, I'm experiencing an issue that is causing me a lot of trouble, so I'm writing to ask for your comments/help/suggestions. More details are provided below.
Issue: I set up a cluster with 24 OSDs and created one pool with 1024 placement groups on it for a small startup company. The number 1024 was calculated per the equation 'OSDs * 100 / pool size'. The cluster has been running quite well for a long time, but recently our monitoring system keeps complaining that some disks' usage exceeds 85%. When I log into the system, I find that some disks' usage is indeed very high, while others are not (less than 60%). Each time the issue happens, I have to manually rebalance the distribution. That is only a short-term fix, and I'm not willing to do it all the time.
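For what it's worth, the arithmetic behind that 1024, assuming a replica size of 3 (the pool size isn't stated above) and the usual round-up to a power of two:

# Rule-of-thumb PG count: OSDs * 100 / pool size, rounded up to a power of two.
# The replica size of 3 is an assumption; the message only gives 24 OSDs and 1024 PGs.
osds = 24
pool_size = 3
raw = osds * 100 / pool_size                  # 800
pg_num = 1 << (int(raw) - 1).bit_length()     # next power of two: 1024
print(raw, pg_num)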
Two long-term solutions come to mind. 1) Ask the customers to expand their clusters by adding more OSDs. But I think they will ask me to explain the reason for the imbalanced data distribution. We've already done some analysis of the environment and learned that the most imbalanced part of CRUSH is the mapping between objects and PGs: the biggest PG has 613 objects, while the smallest PG has only 226 objects (one way to pull these per-PG counts out of ceph pg dump is sketched after this list).
2) Increase the number of placement groups. It can be of great help for statistically uniform data distribution, but it can also incur significant data movement as PGs are effectively split. I just cannot do it in our customers' environment before we understand the consequences 100%. Has anyone done this in a production environment? How much does this operation affect client performance?
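On the 613-vs-226 measurement in 1): one way to get per-PG object counts is to parse 'ceph pg dump'. A sketch follows, with the caveat that the JSON field names ('pg_stats', 'stat_sum', 'num_objects') are assumptions that may differ between Ceph releases:

# Sketch: find the most and least populated PGs from 'ceph pg dump'.
# Field names are assumptions about the JSON layout and may vary by release.
import json
import subprocess

raw = subprocess.check_output(["ceph", "pg", "dump", "--format", "json"]).decode()
dump = json.loads(raw)
counts = sorted((pg["stat_sum"]["num_objects"], pg["pgid"]) for pg in dump["pg_stats"])
print("smallest PG:", counts[0])
print("largest PG:", counts[-1])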
Any comments/help/suggestions will be highly appreciated.
-- Best Regards Jevon