Re: Help rebalancing OSD usage, Luminous 12.2.2

Thanks for the response, Janne!

 

Here is what test-reweight-by-utilization gives me:

 

[root@carf-ceph-osd01 ~]# ceph osd test-reweight-by-utilization
no change
moved 12 / 4872 (0.246305%)
avg 36.6316
stddev 5.37535 -> 5.29218 (expected baseline 6.02961)
min osd.48 with 25 -> 25 pgs (0.682471 -> 0.682471 * mean)
max osd.90 with 48 -> 48 pgs (1.31034 -> 1.31034 * mean)

oload 120
max_change 0.05
max_change_osds 4
average_utilization 0.5273
overload_utilization 0.6327
osd.113 weight 1.0000 -> 0.9500
osd.87 weight 1.0000 -> 0.9500
osd.29 weight 1.0000 -> 0.9500
osd.52 weight 1.0000 -> 0.9500

 

I tried looking for documentation on this command to see whether there is a way to increase the max_change or max_change_osds, but I can't find any docs on how to do this!

 

Man:

       Subcommand reweight-by-utilization reweight OSDs by utilization [overload-percentage-for-consideration, default 120].

 

       Usage:

 

          ceph osd reweight-by-utilization {<int[100-]>}

          {--no-increasing}

 

The `ceph -h` output:

          osd reweight-by-utilization {<int>} {<float>} {<int>} {--no-increasing}

 

What do those optional parameters do (i.e. {<int>} {<float>} {<int>} {--no-increasing})?
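Going by that usage string, my reading (an assumption on my part, since I haven't found it documented anywhere) is that the three positional arguments map onto the values in the dry-run output above: the overload threshold in percent, max_change, and max_change_osds, and --no-increasing looks like it stops the command from raising the reweight of underfull OSDs back toward 1.0. If that reading is right, something along these lines should let a single run touch more OSDs:

          # assumed argument order: <overload %> <max_change> <max_change_osds>
          # dry-run first, then apply the same values if the proposed plan looks sane
          ceph osd test-reweight-by-utilization 120 0.10 20
          ceph osd reweight-by-utilization 120 0.10 20

The 0.10 and 20 above are just example values for a bigger single-shot adjustment, not something I have tested.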

 

We could keep running this multiple times, but it would be nice to rebalance everything in one shot so that usage ends up reasonably even again.

 

Yes, these backup images do vary greatly in size, but I expected that, just through random PG placement, every OSD would still have accumulated roughly the same mix of small and large objects, so that usage would be much closer to even.  As it stands the usage is badly imbalanced, so I still need to know how to mitigate this going forward.  Should we increase the number of PGs in this pool?

 

[root@carf-ceph-osd01 ~]# ceph osd pool ls detail
[snip]
pool 14 'carf01.rgw.buckets.data' erasure size 3 min_size 2 crush_rule 7 object_hash rjenkins pg_num 512 pgp_num 512 last_change 3187 lfor 0/1005 flags hashpspool,nearfull stripe_width 8192 application rgw
[snip]

 

Given that this will move data around (I think), should we increase the pg_num and pgp_num first and then see how it looks?
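If we do end up increasing the PG count, my understanding is that it would be the usual pool-level settings, roughly as sketched below (the 1024 is only an example target; on a nearfull erasure-coded pool I would expect this to trigger a lot of data movement, so this is a sketch, not something I've run):

          ceph osd pool get carf01.rgw.buckets.data pg_num
          # raise pg_num first, then pgp_num so the new PGs actually get remapped
          ceph osd pool set carf01.rgw.buckets.data pg_num 1024
          ceph osd pool set carf01.rgw.buckets.data pgp_num 1024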

 

Thanks,

-Bryan

 

From: Janne Johansson [mailto:icepic.dz@xxxxxxxxx]
Sent: Wednesday, January 31, 2018 7:53 AM
To: Bryan Banister <bbanister@xxxxxxxxxxxxxxx>
Cc: Ceph Users <ceph-users@xxxxxxxxxxxxxx>
Subject: Re: [ceph-users] Help rebalancing OSD usage, Luminous 12.2.2

 



2018-01-30 17:24 GMT+01:00 Bryan Banister <bbanister@xxxxxxxxxxxxxxx>:

Hi all,

 

We are still very new to running a Ceph cluster and have been running an RGW cluster for a while now (6-ish months); it mainly holds large DB backups (write once, read once, delete after N days).  The system is now warning us about an OSD that is near_full, so we went to look at the usage across OSDs.  We are somewhat surprised at how imbalanced it is, with the lowest usage at 22% full, the highest at nearly 90%, and an almost linear usage pattern across the OSDs (though it looks to step in roughly 5% increments):

 

[root@carf-ceph-osd01 ~]# ceph osd df | sort -nk8
ID  CLASS WEIGHT  REWEIGHT SIZE  USE   AVAIL %USE  VAR  PGS
77   hdd 7.27730  1.00000 7451G 1718G 5733G 23.06 0.43  32
73   hdd 7.27730  1.00000 7451G 1719G 5732G 23.08 0.43  31

 

I noticed that the PG count (the last column there, which counts PGs per OSD, I gather) is fairly even, so perhaps the objects that get into the PGs are very unbalanced in size?
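(One way to sanity-check that would be to compare per-PG byte counts for the data pool, pool 14 above. A rough sketch, with the caveat that the BYTES column position in `ceph pg dump` differs between releases, so check the header line before trusting the awk field number:

          # largest PGs of pool 14 by bytes; verify that $7 really is the BYTES column
          ceph pg dump 2>/dev/null | awk '/^14\./ {print $1, $7}' | sort -k2 -n | tail
)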

 

But yes, using reweight to compensate for this should work for you.

 

ceph osd test-reweight-by-utilization 

 

should be worth testing.


--

May the most significant bit of your life be positive.




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
