Re: Expanding ceph cluster by adding more OSDs

I've contracted and expanded clusters by up to a rack of 216 OSDs - 18 nodes, 12 drives each.  New disks are configured with a CRUSH weight of 0, and I slowly add weight (in increments of 0.01 to 0.1), wait for the cluster to become active+clean, and then add more weight.  I was expanding after a contraction, so my PG count didn't need to be corrected; I tend to be liberal and opt for more PGs.  If I hadn't contracted the cluster prior to expanding it, I would probably add PGs after all the new OSDs have finished being weighted into the cluster.
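
For anyone who wants the mechanics, here is a rough sketch of that ramp-up as shell commands.  The OSD id (osd.48), the host bucket (node18), the 0.05 step size and the 0.20 target weight are all made-up placeholders, not values from a real cluster:

    # Add the new OSD to CRUSH with weight 0 so it takes no data yet.
    ceph osd crush add osd.48 0 host=node18

    # Raise the weight in small steps, letting the cluster settle in between.
    for w in 0.05 0.10 0.15 0.20; do
        ceph osd crush reweight osd.48 $w
        # HEALTH_OK is a crude proxy for "all PGs are active+clean" here;
        # watching 'ceph -s' by hand works just as well.
        while ! ceph health | grep -q HEALTH_OK; do
            sleep 60
        done
    done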


On Wed, Oct 9, 2013 at 8:55 PM, Michael Lowe <j.michael.lowe@xxxxxxxxx> wrote:
I had those same questions.  I think the answer I got was that it's better to have too few PGs than to have overloaded OSDs, so add OSDs first, then add PGs.  I don't know the best increments to grow in; it probably depends largely on the hardware in your OSDs.
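
For reference, once the new OSDs are fully weighted in, the split itself is just two pool settings.  A minimal sketch, where the pool name 'data' and the 2048 target are placeholders:

    # Create the new, empty PGs on the pool.
    ceph osd pool set data pg_num 2048
    # Let placement actually use them; this step is what moves data around.
    ceph osd pool set data pgp_num 2048
    # Confirm the new values took effect.
    ceph osd pool get data pg_num
    ceph osd pool get data pgp_num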

Sent from my iPad

> On Oct 9, 2013, at 11:34 PM, Guang <yguang11@xxxxxxxxx> wrote:
>
> Thanks Mike. I get your point.
>
> There are still a few things confusing me:
>  1) We expand the Ceph cluster by adding more OSDs, which triggers a rebalance of PGs across the old and new OSDs and likely breaks the optimal PG count for the cluster.
>   2) We can add more PGs, which triggers a rebalance of objects across the old and new PGs.
>
> So:
>  1) What is the recommended way to expand the cluster by adding OSDs (and potentially adding PGs)? Should we do both at the same time?
>  2) What is the recommended way to scale a cluster from, say, 1PB to 2PB? Should we grow it in steps (1.1PB, 1.2PB, and so on) or go straight to 2PB?
>
> Thanks,
> Guang
>
>> On Oct 10, 2013, at 11:10 AM, Michael Lowe wrote:
>>
>> There used to be; I can't find it right now.  Something like 'ceph osd pool set <pool> pg_num <num>', then 'ceph osd pool set <pool> pgp_num <num>' to actually move your data into the new PGs.  I successfully did it several months ago, when bobtail was current.
>>
>> Sent from my iPad
>>
>>> On Oct 9, 2013, at 10:30 PM, Guang <yguang11@xxxxxxxxx> wrote:
>>>
>>> Thanks Mike.
>>>
>>> Is there any documentation for that?
>>>
>>> Thanks,
>>> Guang
>>>
>>>> On Oct 9, 2013, at 9:58 PM, Mike Lowe wrote:
>>>>
>>>> You can add PGs; the process is called splitting.  I don't think PG merging, the reduction in the number of PGs, is ready yet.
>>>>
>>>>> On Oct 8, 2013, at 11:58 PM, Guang <yguang11@xxxxxxxxx> wrote:
>>>>>
>>>>> Hi ceph-users,
>>>>> Ceph recommends that the number of PGs for a pool be (100 * OSDs) / Replicas. Per my understanding, the PG count of a pool stays fixed even as we scale the cluster out or in by adding or removing OSDs. Does that mean that if we double the number of OSDs, the PG count for a pool is no longer optimal and there is no way to correct it?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Guang



--

Kyle
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
