Re: expanding cluster with minimal impact

Hi Laszlo,

I've used Dan's script to deploy 9 storage nodes (36 x 6TB data disks/node) into our dev cluster as practice for deployment into our production cluster.

The script performs very well. In general, disruption to the cluster (e.g. impact on client I/O) is minimised by osd_max_backfills, which defaults to 1 if not set in ceph.conf. I did find that with 324 OSDs, 60s was too short for one reweight run to complete before the next run started, but the interval is configurable.
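For reference, the throttle mentioned above can be pinned explicitly in ceph.conf; a minimal sketch (the value 1 is the default, shown here only for clarity):

```ini
[osd]
# Limit concurrent backfill operations per OSD to reduce client I/O impact
osd_max_backfills = 1
```

On a running cluster the same value can also be changed on the fly with `ceph tell osd.* injectargs '--osd-max-backfills 1'`.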

The command I ran was:
/usr/local/bin/ceph-gentle-reweight -o osd.78,...,osd.401 -b 0 -d 0.01 -t 5.458 -l 100 -p dteam -i 300 -r

With 85TB of data on 342TB of capacity that was grown to 2286TB, the process took 57h to get to 25% of target, 87h to get to 50%, 110h to get to 75% and 128h to complete.
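The quartile timings above suggest the rebalance sped up as it progressed; a quick check of the per-quartile durations (figures taken directly from the paragraph above):

```python
# Cumulative hours to reach each quarter of the target weight,
# from the rebalance described above.
cumulative = [57, 87, 110, 128]

# Hours spent within each quarter
per_quartile = [b - a for a, b in zip([0] + cumulative, cumulative)]
print(per_quartile)  # [57, 30, 23, 18]
```

So each successive quarter of the reweight completed faster than the previous one, as more of the new OSDs carried weight and shared the backfill load.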

Best wishes,
Bruno


Bruno Canning
LHC Data Store System Administrator
Scientific Computing Department
STFC Rutherford Appleton Laboratory
Harwell Oxford
Didcot
OX11 0QX
Tel. +44 ((0)1235) 446621


-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Dan van der Ster
Sent: 04 August 2017 08:58
To: Laszlo Budai
Cc: ceph-users
Subject: Re:  expanding cluster with minimal impact

Hi Laszlo,

The script defaults are what we used to do a large intervention (the default delta weight is 0.01). For our clusters going any faster becomes disruptive, but this really depends on your cluster size and activity.
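The approach the script takes can be sketched roughly as follows. This is a simplified illustration of the idea, not the script itself: the real tool also monitors cluster health and latency between steps, and the function name here is hypothetical.

```python
def gentle_reweight_plan(current, target, delta=0.01):
    """Return the sequence of crush weights an OSD would step through
    when raised from `current` to `target` in increments of `delta`.

    Simplified sketch of the gentle-reweight idea: the real script
    applies each step with `ceph osd crush reweight`, then waits for
    backfill to settle before taking the next step.
    """
    weights = []
    w = current
    while w < target:
        # round() guards against float drift accumulating across steps
        w = min(round(w + delta, 4), target)
        weights.append(w)
    return weights
```

For example, `gentle_reweight_plan(0.0, 0.05)` yields five steps of 0.01 each; a full-size OSD being raised from 0 to a target weight of ~5.5 in 0.01 increments would take several hundred steps, which is why the per-step interval matters so much for total duration.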

BTW, in case it wasn't clear: to use this script for adding capacity, you need to add the new OSDs to the cluster with an initial crush weight of 0.0:

osd crush initial weight = 0
osd crush update on start = true

-- Dan



On Thu, Aug 3, 2017 at 8:12 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
> Dear all,
>
> I need to expand a ceph cluster with minimal impact. Reading previous 
> threads on this topic from the list I've found the 
> ceph-gentle-reweight script
> (https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-gentl
> e-reweight) created by Dan van der Ster (Thank you Dan for sharing the 
> script with us!).
>
> I've done some experiments, and it looks promising, but the parameters
> need to be set properly. Has any of you tested this script before? What
> is the recommended delta_weight to use? From the script's default
> parameters I can see that the default delta weight is 0.5% of the target
> weight, which means 200 reweighting cycles. I have experimented with a
> reweight ratio of 5% while running a fio test on a client. The results
> were OK (I mean no slow requests), but my test cluster was a very small one.
>
> If any of you has done some larger experiments with this script I 
> would be really interested to read about your results.
>
> Thank you!
> Laszlo
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


