Re: tunable question

Hi,

For the record, we changed the crush tunables from "hammer" to "optimal" yesterday at 14:00, and the rebalance finished this morning at 9:00, so it took 19 hours.

This was on a small Ceph cluster: 24 x 4 TB OSDs spread over three hosts, connected over 10G Ethernet. Total amount of data: 32730 GB used, 56650 GB / 89380 GB avail.

We set the noscrub and nodeep-scrub flags during the rebalance, and our VMs experienced basically no impact.
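
For reference, the whole sequence was roughly the following (from memory, so double-check against the docs for your release):

ceph osd set noscrub
ceph osd set nodeep-scrub
ceph osd crush tunables optimal
# watch the backfill progress with "ceph -s" (or "ceph -w")
ceph osd unset noscrub
ceph osd unset nodeep-scrub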

MJ


On 10/03/2017 05:37 PM, lists wrote:
Thanks Jake, for your extensive reply. :-)

MJ

On 3-10-2017 15:21, Jake Young wrote:

On Tue, Oct 3, 2017 at 8:38 AM lists <lists@xxxxxxxxxxxxx> wrote:

    Hi,

    What would make the decision easier: if we knew that we could easily
    revert the
      > "ceph osd crush tunables optimal"
    once it has begun rebalancing data?

    Meaning: if we notice that impact is too high, or it will take too long,
    that we could simply again say
      > "ceph osd crush tunables hammer"
    and the cluster would calm down again?


Yes, you can revert the tunables, but the cluster will then move all of that data back to where it was, so be prepared for that.
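
If you want to double-check which profile the cluster is actually using before and after, something like this should do it:

ceph osd crush show-tunables

and reverting is just the same command in the other direction, e.g. "ceph osd crush tunables hammer".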

Verify you have the following values in ceph.conf. Note that these are the defaults in Jewel, so if they aren’t defined, you’re probably good:
osd_max_backfills=1
osd_recovery_threads=1
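
To see what a running OSD actually has (rather than what's in ceph.conf), you can ask it over its admin socket on the host where it runs, for example (osd.0 here is just an example id):

ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery'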

You can try to set these at runtime (using ceph tell ... injectargs) if you notice a large impact on your client performance:
osd_recovery_op_priority=1
osd_recovery_max_active=1
osd_recovery_threads=1
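
Something like this should push those values into all running OSDs without a restart (syntax from memory; double-check against your version's docs):

ceph tell osd.* injectargs '--osd-recovery-op-priority 1 --osd-recovery-max-active 1 --osd-recovery-threads 1'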

I recall this tunables change from when we went from hammer to jewel last year. It took over 24 hours to rebalance 122 TB on our 110-OSD cluster.

Jake



    MJ

    On 2-10-2017 9:41, Manuel Lausch wrote:
     > Hi,
     >
      > We have similar issues.
      > After upgrading from hammer to jewel, the tunable "chooseleaf_stable"
      > was introduced. If we activate it, nearly all data will be moved. The
      > cluster has 2400 OSDs on 40 nodes over two datacenters and is
      > filled with 2.5 PB of data.
      >
      > We tried to enable it, but the backfill traffic is too high to be
      > handled without impacting other services on the network.
      >
      > Does someone know if it is necessary to enable this tunable? And could
      > it be a problem in the future if we want to upgrade to newer versions
      > without it enabled?
     >
     > Regards,
     > Manuel Lausch
     >
    _______________________________________________
    ceph-users mailing list
    ceph-users@xxxxxxxxxxxxxx
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




