Re: Increasing PG number


 



Last summer we increased an EC 8+3 pool from 1024 to 2048 PGs on our ~1500 OSD (Kraken) cluster. This pool contained ~2 petabytes of data at the time.

 

We did a fair amount of testing on a throwaway pool on the same cluster beforehand, starting with small increases (16/32/64).

 

The main observation was that the act of splitting the PGs causes issues, not the resulting data movement, assuming your backfills are tuned to a level where they don’t affect client IO.
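To give a sense of the kind of throttling I mean, here is a minimal dry-run sketch: the option names are real Ceph OSD settings, but the values are illustrative only, and the commands are echoed rather than executed.

```shell
#!/bin/sh
# Illustrative backfill throttling -- values are examples, not recommendations.
# Commands are echoed (dry run) rather than executed.
MAX_BACKFILLS=1
RECOVERY_ACTIVE=1
echo "ceph tell 'osd.*' injectargs '--osd-max-backfills $MAX_BACKFILLS'"
echo "ceph tell 'osd.*' injectargs '--osd-recovery-max-active $RECOVERY_ACTIVE'"
```

The right values depend entirely on your hardware and client load, so test before relying on them.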

 

As PG splitting and peering (the pg_num and pgp_num increases) are a) irreversible and b) take effect immediately, overly large increases can leave you with an unhappy mess of excessive storage-node load, flapping OSDs and blocked requests.

 

We ended up doing increases of 128 PGs at a time.
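As a sketch, the stepped increase looked roughly like this (the pool name is hypothetical, the commands are only echoed as a dry run, and in practice we waited for the cluster to settle back to HEALTH_OK between steps):

```shell
#!/bin/sh
# Dry-run sketch of a stepped pg_num increase (1024 -> 2048 in steps of 128).
# POOL is a hypothetical name -- substitute your own.
POOL=mypool
CUR=1024
TARGET=2048
STEP=128
while [ "$CUR" -lt "$TARGET" ]; do
  CUR=$((CUR + STEP))
  echo "ceph osd pool set $POOL pg_num $CUR"
  echo "ceph osd pool set $POOL pgp_num $CUR"
  # In practice: wait for peering/backfill to settle before the next step.
done
```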

 

I’d hazard a guess that you will be fine going straight to 512 PGs, but the only way to be sure of the correct increase size for your cluster is to test it.

 

Cheers

Tom

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Karun Josy
Sent: 02 January 2018 16:23
To: Hans van den Bogert <hansbogert@xxxxxxxxx>
Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: [ceph-users] Increasing PG number

 

https://access.redhat.com/solutions/2457321

It says it is a very intensive process and can affect cluster performance.

 

Our version is Luminous 12.2.2.

We are using an erasure coding profile for a pool 'ecpool' with k=5 and m=3.

The current PG number is 256 and the pool holds about 20 TB of data.


Should I increase it gradually, or set pg_num to 512 in one step?

 

 

 


Karun Josy

 

On Tue, Jan 2, 2018 at 9:26 PM, Hans van den Bogert <hansbogert@xxxxxxxxx> wrote:

Please refer to standard documentation as much as possible, 

 

 

Hans’ is also incomplete, since you need to change the ‘pgp_num’ as well.
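For example (echoed here as a dry run; the pool name ‘ecpool’ and the target of 512 are taken from Karun’s mail):

```shell
#!/bin/sh
# Dry run: raising pg_num splits the PGs; data only starts rebalancing
# once pgp_num is raised to match.
POOL=ecpool
TARGET=512
echo "ceph osd pool set $POOL pg_num $TARGET"
echo "ceph osd pool set $POOL pgp_num $TARGET"
```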

 

Regards,

 

Hans

 

On Jan 2, 2018, at 4:41 PM, Vladimir Prokofev <v@xxxxxxxxxxx> wrote:

 

I increased the number of PGs in multiple pools in a production cluster on 12.2.2 recently - zero issues.

Ceph claims that increasing pg_num and pgp_num is a safe operation, which is essential for its ability to scale, and that sounds pretty reasonable to me. [1]

 

 

 

2018-01-02 18:21 GMT+03:00 Karun Josy <karunjosy1@xxxxxxxxx>:

Hi,

 

The initial PG count was not properly planned while setting up the cluster, so now there are fewer than 50 PGs per OSD.

 

What are the best practices for increasing the PG number of a pool?

We have replicated pools as well as EC pools.

 

Or is it better to create a new pool with higher PG numbers?

 

 

Karun 

 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 
