Re: Adding storage to existing clusters with minimal impact

Hi All,

Thanks for your ideas and recommendations. I've been experimenting with
https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-gentle-reweight
by Dan van der Ster, and it is producing good results.

It does indeed seem that adjusting the crush weight up from zero is the way to go, rather than using osd reweight.
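
For anyone following along, the core idea (raise the crush weight in small steps and let the cluster settle in between) can be sketched roughly like this. This is not the ceph-gentle-reweight script itself, and the osd id, target weight and step size below are only example values:

#!/bin/bash
# Rough sketch only: gently raise one osd's crush weight in small steps,
# waiting for peering/backfill/recovery to finish before the next bump.
osd=12          # example osd id
target=5.458    # example final crush weight (roughly the disk size in TB)
step=0.5        # example increment per round

current=0
while (( $(echo "$current < $target" | bc -l) )); do
    current=$(echo "$current + $step" | bc -l)
    # don't overshoot the target weight
    if (( $(echo "$current > $target" | bc -l) )); then
        current=$target
    fi
    echo "setting osd.$osd crush weight to $current"
    ceph osd crush reweight "osd.$osd" "$current"
    # wait until the cluster has settled before the next increment
    while ceph health | grep -Eq 'peer|backfill|recover'; do
        sleep 30
    done
done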

Best wishes,
Bruno


________________________________________
From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Peter Maloney [peter.maloney@xxxxxxxxxxxxxxxxxxxx]
Sent: 06 July 2017 19:29
To: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  Adding storage to existing clusters with minimal impact

Here's my possibly unique method... I had 3 nodes with 12 disks each, and
when adding 2 more nodes I had issues with the common method you describe
(it totally blocked clients for minutes), but the following worked great
for me:

> my own method:
> - osd max backfills = 1 and osd recovery max active = 1 (a note on setting these at runtime follows after the script)
> - create the new osds with crush weight 0 so no peering happens
> - (starting here the script below does it, e.g. `ceph_activate_osds 6` will set weight 6)
> - after they're up, set their reweight to 0
> - then set the crush weight to the size of the disk in TB
> - peering starts, but reweight is 0 so it doesn't block clients
> - when that's done, set reweight to 1; it should be faster than the previous peering and not bother clients as much
>
>
> # list osds with hosts next to them for easy filtering with awk (doesn't support chassis, rack, etc. buckets)
> ceph_list_osd() {
>     ceph osd tree | awk '
>         BEGIN {found=0; host=""};
>         $3 == "host" {found=1; host=$4; getline};
>         $3 == "host" {found=0}
>         found || $3 ~ /osd\./ {print $0 " " host}'
> }
>
> peering_sleep() {
>     echo "sleeping"
>     sleep 2
>     while ceph health | grep -q peer; do
>         echo -n .
>         sleep 1
>     done
>     echo
>     sleep 5
> }
>
> # after an osd is already created, this reweights them to 'activate' them
> ceph_activate_osds() {
>     weight="$1"
>     host=$(hostname -s)
>
>     if [ -z "$weight" ]; then
>         # TODO: somehow make this automatic...
>         # This assumes all disks are the same weight.
>         weight=6.00099
>     fi
>
>     # for crush weight 0 osds, set reweight 0 first so that setting the crush weight non-zero won't cause as many blocked requests
>     for id in $(ceph_list_osd | awk '$2 == 0 {print $1}'); do
>         ceph osd reweight $id 0 &
>     done
>     wait
>     peering_sleep
>
>     # the harsh reweight which we do slowly
>     for id in $(ceph_list_osd | awk -v host="$host" '$5 == 0 && $7 == host {print $1}'); do
>         echo ceph osd crush reweight "osd.$id" "$weight"
>         ceph osd crush reweight "osd.$id" "$weight"
>         peering_sleep
>     done
>
>     # the light reweight
>     for id in $(ceph_list_osd | awk -v host="$host" '$5 == 0 && $7 == host {print $1}'); do
>         ceph osd reweight $id 1 &
>     done
>     wait
> }
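
(A note on the first step in the list above: the backfill and recovery throttles can be applied to a running cluster without restarting the osds. This is only a sketch; exact option names and defaults may differ between releases.)

# throttle backfill/recovery on all osds at runtime
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

# or persist the same values in ceph.conf under [osd]:
#   osd max backfills = 1
#   osd recovery max active = 1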


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


