jewel to luminous upgrade, chooseleaf_vary_r and chooseleaf_stable

Adrian <aussieade@xxxxxxxxx> · Mon, 14 May 2018 17:01:14 +1000

Hi all,

We recently upgraded our old Ceph cluster to jewel (5 mons and 21 storage hosts, each with 9 x 6 TB filestore OSDs and 3 SSDs carrying 3 journals each). It is mostly used for OpenStack compute/cinder.

To get there we had to settle on chooseleaf_vary_r = 4 to minimize client impact and save time. We now need to get to luminous, and we are on a tight deadline.

Current tunables are:
  {
      "choose_local_tries": 0,
      "choose_local_fallback_tries": 0,
      "choose_total_tries": 50,
      "chooseleaf_descend_once": 1,
      "chooseleaf_vary_r": 4,
      "chooseleaf_stable": 0,
      "straw_calc_version": 1,
      "allowed_bucket_algs": 22,
      "profile": "unknown",
      "optimal_tunables": 0,
      "legacy_tunables": 0,
      "minimum_required_version": "firefly",
      "require_feature_tunables": 1,
      "require_feature_tunables2": 1,
      "has_v2_rules": 0,
      "require_feature_tunables3": 1,
      "has_v3_rules": 0,
      "has_v4_buckets": 0,
      "require_feature_tunables5": 0,
      "has_v5_rules": 0
  }
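
As a quick way to see what still has to change, here is a minimal Python sketch that compares a tunables dump like the one above (the kind of JSON that `ceph osd crush show-tunables` prints) against the end state discussed in this post. The `TARGET` dict and the `pending_changes` helper are made-up names for illustration, and the JSON is a trimmed copy of the dump:

```python
import json

# Trimmed copy of the tunables dump above (only the keys of interest).
CURRENT = json.loads("""
{
    "chooseleaf_descend_once": 1,
    "chooseleaf_vary_r": 4,
    "chooseleaf_stable": 0
}
""")

# Hypothetical target: the end state described in this post.
TARGET = {"chooseleaf_vary_r": 1, "chooseleaf_stable": 1}


def pending_changes(current, target):
    """Return {name: (current_value, target_value)} for tunables that differ."""
    return {
        name: (current.get(name), wanted)
        for name, wanted in target.items()
        if current.get(name) != wanted
    }


changes = pending_changes(CURRENT, TARGET)
for name, (have, want) in sorted(changes.items()):
    print(f"{name}: {have} -> {want}")
```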

With chooseleaf_stable set to 1, the crush compare tool says:
   Replacing the crushmap specified with --origin with the crushmap
  specified with --destination will move 8774 PGs (59.08417508417509% of the total)
  from one item to another.
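
The total that the tool is working from can be backed out of those two numbers. A small sketch of the arithmetic, using only the figures quoted in the output above (the variable names are mine):

```python
# Figures quoted in the crush compare output above.
moved_pgs = 8774
moved_percent = 59.08417508417509

# total = moved / (percent / 100); rounded because the total is a whole count.
total_pgs = round(moved_pgs * 100 / moved_percent)
unmoved_pgs = total_pgs - moved_pgs

print(total_pgs, unmoved_pgs)
```

So the comparison covers 14850 entries in the tool's count, of which 6076 would stay where they are.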

Our current ceph.conf tunings are:
  #THROTTLING CEPH
  osd_max_backfills = 1
  osd_recovery_max_active = 1
  osd_recovery_op_priority = 1
  osd_client_op_priority = 63

  #PERFORMANCE TUNING
  osd_op_threads = 6
  filestore_op_threads = 10
  filestore_max_sync_interval = 30
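
If the throttles need adjusting while a rebalance is running, one option is `ceph tell osd.* injectargs`. The sketch below only builds the command string from the throttle values above and does not run anything; the `injectargs_command` helper is a hypothetical name, and whether each option actually takes effect at runtime on a given release is an assumption that would need checking:

```python
# Throttle settings copied from the ceph.conf snippet above.
THROTTLES = {
    "osd_max_backfills": 1,
    "osd_recovery_max_active": 1,
    "osd_recovery_op_priority": 1,
    "osd_client_op_priority": 63,
}


def injectargs_command(options, target="osd.*"):
    """Build a `ceph tell ... injectargs` command line from an options dict."""
    args = " ".join(f"--{name} {value}" for name, value in options.items())
    return f"ceph tell {target} injectargs '{args}'"


command = injectargs_command(THROTTLES)
print(command)
```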

Does anyone have advice on what else we can do to balance client impact against recovery speed, or war stories about other things to consider?

I'm also wondering about the interplay between chooseleaf_vary_r and chooseleaf_stable.
Are we better off
1) sticking with chooseleaf_vary_r = 4, setting chooseleaf_stable = 1, upgrading, and then stepping chooseleaf_vary_r down to 1 when more time is available
or
2) stepping chooseleaf_vary_r down to 1 first, then setting chooseleaf_stable = 1, and finally upgrading?
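
To make "incrementally" concrete, here is a rough Python sketch of the stepping in either option: walk chooseleaf_vary_r down one value at a time and rewrite the matching `tunable` line in a decompiled crushmap (the text form produced by `crushtool -d`). The helper names and the crushmap excerpt are hypothetical, and the sketch only edits text; compiling and injecting each map, and waiting for the cluster to settle between steps, is left out:

```python
import re


def vary_r_steps(start, end):
    """Values to apply, one rebalance at a time, when stepping down from start to end."""
    return list(range(start - 1, end - 1, -1))


def set_tunable(crush_text, name, value):
    """Rewrite one `tunable <name> <n>` line in decompiled crushmap text."""
    pattern = re.compile(rf"^tunable {re.escape(name)} \d+$", re.MULTILINE)
    new_text, count = pattern.subn(f"tunable {name} {value}", crush_text)
    if count != 1:
        raise ValueError(f"expected exactly one '{name}' tunable, found {count}")
    return new_text


# Hypothetical excerpt of a decompiled crushmap matching the tunables above.
crush_text = """\
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 4
tunable straw_calc_version 1
"""

steps = vary_r_steps(4, 1)
print(steps)
for value in steps:
    # In practice each step would be compiled, injected and left to rebalance.
    crush_text = set_tunable(crush_text, "chooseleaf_vary_r", value)
print(crush_text)
```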

Bear in mind that we'd like to keep the time it takes to get to luminous as short as possible ;-) (we are guesstimating that a 59% rebalance will take many days).

Any advice/thoughts gratefully received.

Regards,
Adrian.

-- 
---
Adrian : aussieade@xxxxxxxxx
If violence doesn't solve your problem, you're not using enough of it.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com