Re: Rack weight imbalance

My $0.02: there are two kinds of balance here, one for space utilization and another for performance.

It looks like you will be fine on space utilization, but you might suffer a bit on performance as the disk density increases. The new racks will hold 1/3 of the data on 1/5 of the disks, so if we assume the workload is evenly distributed (# of requests / amount of data = a constant), the new racks will become the bottleneck.
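
To put rough numbers on that (a back-of-the-envelope sketch only, taking 288 OSDs per old rack and 72 per new rack from George's mail below, and assuming requests scale with stored data):

# Rough check of per-disk load, using the rack figures from the thread.
old_racks, new_racks = 3, 3
old_weight, new_weight = 1000.0, 500.0   # approximate CRUSH weight per rack
old_osds, new_osds = 288, 72             # OSDs per rack

total_weight = old_racks * old_weight + new_racks * new_weight
data_on_new = new_racks * new_weight / total_weight   # ~1/3 of the data
disks_in_new = float(new_racks * new_osds) / (old_racks * old_osds + new_racks * new_osds)  # ~1/5 of the disks

# With requests proportional to data, per-OSD load scales with weight per OSD.
ratio = (new_weight / new_osds) / (old_weight / old_osds)
print("new racks hold {:.0%} of data on {:.0%} of disks".format(data_on_new, disks_in_new))
print("per-OSD load on new racks vs old: {:.1f}x".format(ratio))

So each new 8 TB OSD ends up serving roughly twice the requests of an old one, which is what I mean by bottleneck.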

Primary affinity might help (it can steer read requests toward the old racks), or maybe your disks are fairly idle and it is not a problem at all :)
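
If you try that, a minimal sketch of what I mean is below. The OSD ids and the affinity value are just placeholders, and older clusters may also need "mon osd allow primary affinity = true" on the monitors before the setting takes effect:

# Sketch: lower primary affinity on the new, denser OSDs so that reads are
# mostly served by primaries on the old racks.  OSD ids here are made up.
import subprocess

new_osd_ids = range(864, 1080)   # placeholder ids for the 216 new 8 TB OSDs
affinity = 0.5                   # 0.0 = never primary, 1.0 = default

for osd_id in new_osd_ids:
    # same as: ceph osd primary-affinity osd.<id> <affinity>
    subprocess.check_call(
        ["ceph", "osd", "primary-affinity", "osd.{}".format(osd_id), str(affinity)])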


-Xiaoxi





On 2/23/16, 4:19 AM, "ceph-users on behalf of Gregory Farnum" <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of gfarnum@xxxxxxxxxx> wrote:

>On Mon, Feb 22, 2016 at 9:29 AM, George Mihaiescu <lmihaiescu@xxxxxxxxx> wrote:
>> Hi,
>>
>> We have a fairly large Ceph cluster (3.2 PB) that we want to expand and we
>> would like to get your input on this.
>>
>> The current cluster has around 700 OSDs (4 TB and 6 TB) in three racks with
>> the largest pool being rgw and using a replica 3.
>> For non-technical reasons (budgetary, etc) we are considering getting three
>> more racks, but initially adding only two storage nodes with 36 x 8 TB
>> drives in each, which will basically cause the rack weights to be imbalanced
>> (three racks with weight around 1000 and 288 OSDs, and three racks with
>> weight around 500 but only 72 OSDs)
>>
>> The one replica per rack CRUSH rule will cause existing data to be
>> re-balanced among all six racks, with OSDs in the new racks getting only a
>> proportionate amount of replicas.
>>
>> Do you see any possible problems with this approach? Should Ceph be able to
>> properly rebalance the existing data among racks with imbalanced weights?
>>
>> Thank you for your input and please let me know if you need additional info.
>
>This should be okay; you have multiple racks in each size and aren't
>trying to replicate a full copy to each rack individually. You can
>test it ahead of time with the crush tool, though:
>http://docs.ceph.com/docs/master/man/8/crushtool/
>It may turn out you're using old tunables and want to update them
>first or something.
>-Greg
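
For the ahead-of-time test Greg mentions, something along these lines should work (file names are placeholders; you still need to hand-edit the decompiled map to add the new racks before testing):

# Sketch: dump the current CRUSH map, then use crushtool's test mode to see
# how PGs would map and how utilization looks with 3 replicas.
import subprocess

subprocess.check_call(["ceph", "osd", "getcrushmap", "-o", "crushmap.bin"])
# Decompile, edit a copy to add the new racks/weights, recompile:
subprocess.check_call(["crushtool", "-d", "crushmap.bin", "-o", "crushmap.txt"])
#   ... edit and save as crushmap-new.txt, then:
subprocess.check_call(["crushtool", "-c", "crushmap-new.txt", "-o", "crushmap-new.bin"])
subprocess.check_call(["crushtool", "-i", "crushmap-new.bin",
                       "--test", "--num-rep", "3", "--show-utilization"])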

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


