Re: OSD rebalancing issue - should drives be distributed equally over all nodes

Hi Reed,

I'm not sure what you mean by grouping the nodes into chassis, or by "set your failure domain to chassis".

This is my current crush map:
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class nvme
device 1 osd.1 class nvme
device 2 osd.2 class nvme
device 3 osd.3 class nvme
device 4 osd.4 class nvme
device 5 osd.5 class nvme
device 6 osd.6 class nvme
device 7 osd.7 class nvme
device 8 osd.8 class hdd
device 9 osd.9 class hdd
device 10 osd.10 class hdd
[...]
device 365 osd.365 class hdd
device 366 osd.366 class hdd
device 367 osd.367 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host ld5505 {
    id -3        # do not change unnecessarily
    id -4 class hdd        # do not change unnecessarily
    id -5 class nvme        # do not change unnecessarily
    # weight 33.720
    alg straw2
    hash 0    # rjenkins1
    item osd.8 weight 1.640
    item osd.9 weight 1.640
    item osd.10 weight 1.640
    item osd.11 weight 1.640
    item osd.12 weight 1.640
    item osd.13 weight 1.640
    item osd.14 weight 1.640
    item osd.15 weight 1.640
    item osd.16 weight 1.640
    item osd.17 weight 1.640
    item osd.18 weight 1.640
    item osd.19 weight 1.640
    item osd.20 weight 1.640
    item osd.21 weight 1.640
    item osd.22 weight 1.640
    item osd.23 weight 1.640
    item osd.24 weight 1.640
    item osd.0 weight 2.920
    item osd.1 weight 2.920
}
host ld5506 {
    id -7        # do not change unnecessarily
    id -8 class hdd        # do not change unnecessarily
    id -9 class nvme        # do not change unnecessarily
    # weight 33.720
    alg straw2
    hash 0    # rjenkins1
    item osd.25 weight 1.640
    item osd.26 weight 1.640
    item osd.27 weight 1.640
    item osd.28 weight 1.640
    item osd.29 weight 1.640
    item osd.30 weight 1.640
    item osd.31 weight 1.640
    item osd.32 weight 1.640
    item osd.33 weight 1.640
    item osd.34 weight 1.640
    item osd.35 weight 1.640
    item osd.36 weight 1.640
    item osd.37 weight 1.640
    item osd.38 weight 1.640
    item osd.39 weight 1.640
    item osd.40 weight 1.640
    item osd.41 weight 1.640
    item osd.2 weight 2.920
    item osd.3 weight 2.920
}
host ld5507 {
    id -10        # do not change unnecessarily
    id -11 class hdd        # do not change unnecessarily
    id -12 class nvme        # do not change unnecessarily
    # weight 33.720
    alg straw2
    hash 0    # rjenkins1
    item osd.172 weight 1.640
    item osd.173 weight 1.640
    item osd.174 weight 1.640
    item osd.175 weight 1.640
    item osd.176 weight 1.640
    item osd.177 weight 1.640
    item osd.178 weight 1.640
    item osd.179 weight 1.640
    item osd.180 weight 1.640
    item osd.181 weight 1.640
    item osd.182 weight 1.640
    item osd.183 weight 1.640
    item osd.184 weight 1.640
    item osd.185 weight 1.640
    item osd.186 weight 1.640
    item osd.187 weight 1.640
    item osd.188 weight 1.640
    item osd.4 weight 2.920
    item osd.5 weight 2.920
}
host ld5508 {
    id -13        # do not change unnecessarily
    id -14 class hdd        # do not change unnecessarily
    id -15 class nvme        # do not change unnecessarily
    # weight 33.720
    alg straw2
    hash 0    # rjenkins1
    item osd.60 weight 1.640
    item osd.61 weight 1.640
    item osd.62 weight 1.640
    item osd.63 weight 1.640
    item osd.64 weight 1.640
    item osd.65 weight 1.640
    item osd.66 weight 1.640
    item osd.69 weight 1.640
    item osd.70 weight 1.640
    item osd.71 weight 1.640
    item osd.72 weight 1.640
    item osd.73 weight 1.640
    item osd.74 weight 1.640
    item osd.75 weight 1.640
    item osd.59 weight 1.640
    item osd.67 weight 1.640
    item osd.68 weight 1.640
    item osd.6 weight 2.920
    item osd.7 weight 2.920
}
host ld4464 {
    id -34        # do not change unnecessarily
    id -36 class hdd        # do not change unnecessarily
    id -35 class nvme        # do not change unnecessarily
    # weight 2.000
    alg straw2
    hash 0    # rjenkins1
    item osd.268 weight 1.000
    item osd.269 weight 1.000
}
host ld4465 {
    id -37        # do not change unnecessarily
    id -39 class hdd        # do not change unnecessarily
    id -38 class nvme        # do not change unnecessarily
    # weight 2.000
    alg straw2
    hash 0    # rjenkins1
    item osd.270 weight 1.000
    item osd.271 weight 1.000
}
root default {
    id -1        # do not change unnecessarily
    id -2 class hdd        # do not change unnecessarily
    id -6 class nvme        # do not change unnecessarily
    # weight 138.880
    alg straw2
    hash 0    # rjenkins1
    item ld5505 weight 33.720
    item ld5506 weight 33.720
    item ld5507 weight 33.720
    item ld5508 weight 33.720
    item ld4464 weight 2.000
    item ld4465 weight 2.000
}
host ld5505-hdd_strgbox {
    id -16        # do not change unnecessarily
    id -18 class hdd        # do not change unnecessarily
    id -20 class nvme        # do not change unnecessarily
    # weight 78.720
    alg straw2
    hash 0    # rjenkins1
    item osd.76 weight 1.640
    item osd.77 weight 1.640
    item osd.78 weight 1.640
    item osd.79 weight 1.640
    item osd.80 weight 1.640
    item osd.81 weight 1.640
    item osd.82 weight 1.640
    item osd.83 weight 1.640
    item osd.84 weight 1.640
    item osd.85 weight 1.640
    item osd.86 weight 1.640
    item osd.87 weight 1.640
    item osd.88 weight 1.640
    item osd.89 weight 1.640
    item osd.90 weight 1.640
    item osd.91 weight 1.640
    item osd.92 weight 1.640
    item osd.93 weight 1.640
    item osd.94 weight 1.640
    item osd.95 weight 1.640
    item osd.96 weight 1.640
    item osd.97 weight 1.640
    item osd.98 weight 1.640
    item osd.99 weight 1.640
    item osd.100 weight 1.640
    item osd.101 weight 1.640
    item osd.102 weight 1.640
    item osd.103 weight 1.640
    item osd.104 weight 1.640
    item osd.105 weight 1.640
    item osd.106 weight 1.640
    item osd.107 weight 1.640
    item osd.108 weight 1.640
    item osd.109 weight 1.640
    item osd.110 weight 1.640
    item osd.111 weight 1.640
    item osd.112 weight 1.640
    item osd.113 weight 1.640
    item osd.114 weight 1.640
    item osd.115 weight 1.640
    item osd.116 weight 1.640
    item osd.117 weight 1.640
    item osd.118 weight 1.640
    item osd.119 weight 1.640
    item osd.120 weight 1.640
    item osd.121 weight 1.640
    item osd.122 weight 1.640
    item osd.123 weight 1.640
}
host ld5506-hdd_strgbox {
    id -22        # do not change unnecessarily
    id -23 class hdd        # do not change unnecessarily
    id -24 class nvme        # do not change unnecessarily
    # weight 78.720
    alg straw2
    hash 0    # rjenkins1
    item osd.124 weight 1.640
    item osd.125 weight 1.640
    item osd.126 weight 1.640
    item osd.127 weight 1.640
    item osd.128 weight 1.640
    item osd.129 weight 1.640
    item osd.130 weight 1.640
    item osd.131 weight 1.640
    item osd.132 weight 1.640
    item osd.133 weight 1.640
    item osd.134 weight 1.640
    item osd.135 weight 1.640
    item osd.136 weight 1.640
    item osd.137 weight 1.640
    item osd.138 weight 1.640
    item osd.139 weight 1.640
    item osd.140 weight 1.640
    item osd.141 weight 1.640
    item osd.142 weight 1.640
    item osd.143 weight 1.640
    item osd.144 weight 1.640
    item osd.145 weight 1.640
    item osd.146 weight 1.640
    item osd.147 weight 1.640
    item osd.148 weight 1.640
    item osd.149 weight 1.640
    item osd.150 weight 1.640
    item osd.151 weight 1.640
    item osd.152 weight 1.640
    item osd.153 weight 1.640
    item osd.154 weight 1.640
    item osd.155 weight 1.640
    item osd.156 weight 1.640
    item osd.157 weight 1.640
    item osd.158 weight 1.640
    item osd.159 weight 1.640
    item osd.160 weight 1.640
    item osd.161 weight 1.640
    item osd.162 weight 1.640
    item osd.163 weight 1.640
    item osd.164 weight 1.640
    item osd.165 weight 1.640
    item osd.166 weight 1.640
    item osd.167 weight 1.640
    item osd.168 weight 1.640
    item osd.169 weight 1.640
    item osd.170 weight 1.640
    item osd.171 weight 1.640
}
host ld5507-hdd_strgbox {
    id -25        # do not change unnecessarily
    id -26 class hdd        # do not change unnecessarily
    id -27 class nvme        # do not change unnecessarily
    # weight 78.720
    alg straw2
    hash 0    # rjenkins1
    item osd.215 weight 1.640
    item osd.216 weight 1.640
    item osd.217 weight 1.640
    item osd.218 weight 1.640
    item osd.219 weight 1.640
    item osd.251 weight 1.640
    item osd.252 weight 1.640
    item osd.253 weight 1.640
    item osd.254 weight 1.640
    item osd.255 weight 1.640
    item osd.256 weight 1.640
    item osd.257 weight 1.640
    item osd.258 weight 1.640
    item osd.259 weight 1.640
    item osd.260 weight 1.640
    item osd.261 weight 1.640
    item osd.262 weight 1.640
    item osd.263 weight 1.640
    item osd.264 weight 1.640
    item osd.265 weight 1.640
    item osd.266 weight 1.640
    item osd.267 weight 1.640
    item osd.189 weight 1.640
    item osd.190 weight 1.640
    item osd.191 weight 1.640
    item osd.192 weight 1.640
    item osd.193 weight 1.640
    item osd.194 weight 1.640
    item osd.195 weight 1.640
    item osd.196 weight 1.640
    item osd.197 weight 1.640
    item osd.198 weight 1.640
    item osd.199 weight 1.640
    item osd.200 weight 1.640
    item osd.201 weight 1.640
    item osd.202 weight 1.640
    item osd.203 weight 1.640
    item osd.204 weight 1.640
    item osd.205 weight 1.640
    item osd.206 weight 1.640
    item osd.207 weight 1.640
    item osd.208 weight 1.640
    item osd.209 weight 1.640
    item osd.210 weight 1.640
    item osd.211 weight 1.640
    item osd.212 weight 1.640
    item osd.213 weight 1.640
    item osd.214 weight 1.640
}
host ld5508-hdd_strgbox {
    id -28        # do not change unnecessarily
    id -29 class hdd        # do not change unnecessarily
    id -30 class nvme        # do not change unnecessarily
    # weight 78.720
    alg straw2
    hash 0    # rjenkins1
    item osd.42 weight 1.640
    item osd.43 weight 1.640
    item osd.44 weight 1.640
    item osd.45 weight 1.640
    item osd.46 weight 1.640
    item osd.47 weight 1.640
    item osd.48 weight 1.640
    item osd.49 weight 1.640
    item osd.50 weight 1.640
    item osd.51 weight 1.640
    item osd.52 weight 1.640
    item osd.53 weight 1.640
    item osd.54 weight 1.640
    item osd.55 weight 1.640
    item osd.56 weight 1.640
    item osd.57 weight 1.640
    item osd.58 weight 1.640
    item osd.220 weight 1.640
    item osd.221 weight 1.640
    item osd.222 weight 1.640
    item osd.223 weight 1.640
    item osd.224 weight 1.640
    item osd.225 weight 1.640
    item osd.226 weight 1.640
    item osd.227 weight 1.640
    item osd.228 weight 1.640
    item osd.230 weight 1.640
    item osd.231 weight 1.640
    item osd.232 weight 1.640
    item osd.233 weight 1.640
    item osd.234 weight 1.640
    item osd.235 weight 1.640
    item osd.236 weight 1.640
    item osd.237 weight 1.640
    item osd.238 weight 1.640
    item osd.239 weight 1.640
    item osd.240 weight 1.640
    item osd.241 weight 1.640
    item osd.242 weight 1.640
    item osd.243 weight 1.640
    item osd.244 weight 1.640
    item osd.245 weight 1.640
    item osd.246 weight 1.640
    item osd.247 weight 1.640
    item osd.248 weight 1.640
    item osd.249 weight 1.640
    item osd.250 weight 1.640
    item osd.229 weight 1.640
}
host ld4464-hdd_strgbox {
    id -31        # do not change unnecessarily
    id -33 class hdd        # do not change unnecessarily
    id -32 class nvme        # do not change unnecessarily
    # weight 349.440
    alg straw2
    hash 0    # rjenkins1
    item osd.272 weight 7.280
    item osd.273 weight 7.280
    item osd.274 weight 7.280
    item osd.275 weight 7.280
    item osd.276 weight 7.280
    item osd.277 weight 7.280
    item osd.278 weight 7.280
    item osd.279 weight 7.280
    item osd.280 weight 7.280
    item osd.281 weight 7.280
    item osd.282 weight 7.280
    item osd.283 weight 7.280
    item osd.284 weight 7.280
    item osd.285 weight 7.280
    item osd.286 weight 7.280
    item osd.287 weight 7.280
    item osd.288 weight 7.280
    item osd.289 weight 7.280
    item osd.290 weight 7.280
    item osd.291 weight 7.280
    item osd.292 weight 7.280
    item osd.293 weight 7.280
    item osd.294 weight 7.280
    item osd.295 weight 7.280
    item osd.296 weight 7.280
    item osd.297 weight 7.280
    item osd.298 weight 7.280
    item osd.299 weight 7.280
    item osd.300 weight 7.280
    item osd.302 weight 7.280
    item osd.303 weight 7.280
    item osd.304 weight 7.280
    item osd.305 weight 7.280
    item osd.306 weight 7.280
    item osd.307 weight 7.280
    item osd.308 weight 7.280
    item osd.309 weight 7.280
    item osd.310 weight 7.280
    item osd.311 weight 7.280
    item osd.312 weight 7.280
    item osd.313 weight 7.280
    item osd.314 weight 7.280
    item osd.315 weight 7.280
    item osd.316 weight 7.280
    item osd.317 weight 7.280
    item osd.318 weight 7.280
    item osd.301 weight 7.280
    item osd.319 weight 7.280
}
host ld4465-hdd_strgbox {
    id -40        # do not change unnecessarily
    id -42 class hdd        # do not change unnecessarily
    id -41 class nvme        # do not change unnecessarily
    # weight 349.440
    alg straw2
    hash 0    # rjenkins1
    item osd.320 weight 7.280
    item osd.321 weight 7.280
    item osd.322 weight 7.280
    item osd.323 weight 7.280
    item osd.324 weight 7.280
    item osd.325 weight 7.280
    item osd.326 weight 7.280
    item osd.327 weight 7.280
    item osd.328 weight 7.280
    item osd.329 weight 7.280
    item osd.330 weight 7.280
    item osd.331 weight 7.280
    item osd.332 weight 7.280
    item osd.333 weight 7.280
    item osd.334 weight 7.280
    item osd.335 weight 7.280
    item osd.336 weight 7.280
    item osd.337 weight 7.280
    item osd.338 weight 7.280
    item osd.339 weight 7.280
    item osd.340 weight 7.280
    item osd.341 weight 7.280
    item osd.342 weight 7.280
    item osd.343 weight 7.280
    item osd.344 weight 7.280
    item osd.345 weight 7.280
    item osd.346 weight 7.280
    item osd.347 weight 7.280
    item osd.348 weight 7.280
    item osd.349 weight 7.280
    item osd.350 weight 7.280
    item osd.351 weight 7.280
    item osd.352 weight 7.280
    item osd.353 weight 7.280
    item osd.354 weight 7.280
    item osd.355 weight 7.280
    item osd.356 weight 7.280
    item osd.357 weight 7.280
    item osd.358 weight 7.280
    item osd.360 weight 7.280
    item osd.361 weight 7.280
    item osd.362 weight 7.280
    item osd.363 weight 7.280
    item osd.364 weight 7.280
    item osd.365 weight 7.280
    item osd.366 weight 7.280
    item osd.367 weight 7.280
    item osd.359 weight 7.280
}
root hdd_strgbox {
    id -17        # do not change unnecessarily
    id -19 class hdd        # do not change unnecessarily
    id -21 class nvme        # do not change unnecessarily
    # weight 1013.760
    alg straw2
    hash 0    # rjenkins1
    item ld5505-hdd_strgbox weight 78.720
    item ld5506-hdd_strgbox weight 78.720
    item ld5507-hdd_strgbox weight 78.720
    item ld5508-hdd_strgbox weight 78.720
    item ld4464-hdd_strgbox weight 349.440
    item ld4465-hdd_strgbox weight 349.440
}

# rules
rule replicated_rule {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}
rule hdd_strgbox_rule {
    id 1
    type replicated
    min_size 1
    max_size 10
    step take hdd_strgbox class hdd
    step chooseleaf firstn 0 type host
    step emit
}
rule nvme_rule {
    id 2
    type replicated
    min_size 1
    max_size 10
    step take default class nvme
    step chooseleaf firstn 0 type host
    step emit
}
rule hdd_rule {
    id 3
    type replicated
    min_size 1
    max_size 10
    step take default class hdd
    step chooseleaf firstn 0 type host
    step emit
}

# end crush map
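
The "# weight" comments in the map above can be cross-checked by summing the "item ... weight" lines of each bucket. The following is only a minimal sketch, not part of Ceph: the function name bucket_weights is made up for illustration, and the sample text is a shortened excerpt of the map (ld4464 and the hdd_strgbox root), assuming the decompiled text layout shown above.

```python
import re

def bucket_weights(crush_text):
    """Sum the 'item ... weight W' lines inside each bucket of a decompiled CRUSH map."""
    totals = {}
    current = None
    for line in crush_text.splitlines():
        line = line.strip()
        # A bucket starts with "<type> <name> {", e.g. "host ld5505 {".
        start = re.match(r"^(\w+)\s+(\S+)\s*\{$", line)
        if start and start.group(1) != "rule":
            current = start.group(2)
            totals[current] = 0.0
        elif line == "}":
            current = None
        elif current is not None:
            item = re.match(r"^item\s+\S+\s+weight\s+([0-9.]+)$", line)
            if item:
                totals[current] += float(item.group(1))
    return totals

# Shortened excerpt of the map above (hypothetical helper input, not the full map).
sample = """
host ld4464 {
    id -34        # do not change unnecessarily
    # weight 2.000
    alg straw2
    hash 0    # rjenkins1
    item osd.268 weight 1.000
    item osd.269 weight 1.000
}
root hdd_strgbox {
    id -17        # do not change unnecessarily
    # weight 1013.760
    alg straw2
    hash 0    # rjenkins1
    item ld5505-hdd_strgbox weight 78.720
    item ld5506-hdd_strgbox weight 78.720
    item ld5507-hdd_strgbox weight 78.720
    item ld5508-hdd_strgbox weight 78.720
    item ld4464-hdd_strgbox weight 349.440
    item ld4465-hdd_strgbox weight 349.440
}
"""

weights = bucket_weights(sample)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

For the hdd_strgbox root this makes the imbalance visible: two of the six host buckets carry about 69% of the total weight.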


On 24.09.2019 at 23:33, Reed Dier wrote:
Hi Thomas,

How does your crush map/tree look?

If your crush failure domain is by host, then your 96x 8T disks will only be as useful as your 1.6T disks, because the smallest failure domain is your limiting factor.
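
To illustrate the point about the smallest failure domain, here is a rough model in Python. It is not CRUSH itself: it assumes that placing a PG behaves like drawing distinct hosts one after another, each with probability proportional to its weight, which only approximates what straw2 with retries does. The replica count of 3 is also an assumption (the status output in this thread shows 153251211 object copies for 51.08M objects, which hints at 3x, but the pool size is not stated). Host weights are taken from the hdd_strgbox root of the map.

```python
from itertools import permutations

def inclusion_probabilities(weights, replicas):
    """Exact chance that each host holds a replica of a PG when `replicas`
    distinct hosts are drawn one after another, proportionally to weight."""
    hosts = list(weights)
    incl = {h: 0.0 for h in hosts}
    for combo in permutations(hosts, replicas):
        remaining = sum(weights.values())
        p = 1.0
        for h in combo:
            p *= weights[h] / remaining   # chance of drawing h among hosts left
            remaining -= weights[h]
        for h in combo:
            incl[h] += p
    return incl

# Host bucket weights from the hdd_strgbox root in the CRUSH map.
weights = {
    "ld5505-hdd_strgbox": 78.72,
    "ld5506-hdd_strgbox": 78.72,
    "ld5507-hdd_strgbox": 78.72,
    "ld5508-hdd_strgbox": 78.72,
    "ld4464-hdd_strgbox": 349.44,
    "ld4465-hdd_strgbox": 349.44,
}
REPLICAS = 3  # assumption, not stated in the thread

incl = inclusion_probabilities(weights, REPLICAS)
total = sum(weights.values())
for host, w in weights.items():
    data_share = incl[host] / REPLICAS   # expected fraction of all replicas
    weight_share = w / total             # fraction of raw capacity
    print(f"{host}: weight share {weight_share:.3f}, "
          f"data share {data_share:.3f}, "
          f"relative fill {data_share / weight_share:.2f}")
```

A host can hold at most one replica of any PG, so its data share cannot exceed 1/REPLICAS. Each large host has a weight share above one third, so in this model it ends up with a relative fill below 1 while the small hosts end up above 1, which matches the small disks filling up first.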

So you can either redistribute your disks to be 16x 8T + 32x 1.6T per host, or you could group your 1.6T nodes into groups (chassis, perhaps), move the 8T nodes into their own chassis, and then set your failure domain to chassis. This would likely lead to a much more even distribution.
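
The arithmetic behind the two options can be sketched as follows, using the per-OSD CRUSH weights from the map (1.640 and 7.280). The chassis names are hypothetical, and putting all four 1.6T nodes into a single chassis is just one possible reading of the suggestion, not something stated in the thread.

```python
SMALL_W = 1.64   # CRUSH weight of a 1.6T OSD, from the map
BIG_W = 7.28     # CRUSH weight of an 8T OSD, from the map
HOSTS = 6

# Option 1: every host holds 16 large and 32 small disks.
per_host = 16 * BIG_W + 32 * SMALL_W
option1_total = HOSTS * per_host

# Option 2: failure domain "chassis"; names below are hypothetical.
chassis = {
    "chassis-small": 4 * 48 * SMALL_W,  # the four 1.6T nodes together
    "chassis-ld4464": 48 * BIG_W,       # one 8T node
    "chassis-ld4465": 48 * BIG_W,       # the other 8T node
}
option2_total = sum(chassis.values())
spread = max(chassis.values()) / min(chassis.values())

print(f"option 1: {per_host:.2f} per host, {option1_total:.2f} total")
for name, w in chassis.items():
    print(f"option 2: {name} weight {w:.2f}")
print(f"option 2: largest/smallest chassis = {spread:.2f}")
```

Both layouts keep the same total weight as the current hdd_strgbox root. Option 1 gives six identical hosts; option 2 gives three failure domains whose weights differ by roughly 11%, compared with a factor of more than 4 between the current host buckets.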

I imagine right now your 1.6T disks are nearfull, and your 8T disks are anything but.

Be careful with something like this, however, because you will probably run into some IOPS discrepancies due to the difference in spindles per TB across 'chassis'.

Hope that helps.

Reed

On Sep 23, 2019, at 4:07 AM, Thomas <74cmonty@xxxxxxxxx> wrote:

Hi,

I'm facing several issues with my ceph cluster (2x MDS, 6x OSD).
Here I would like to focus on the issue with pgs backfill_toofull.
I assume this is related to the fact that the data distribution on my
OSDs is not balanced.

This is the current ceph status:
root@ld3955:~# ceph -s
   cluster:
     id:     6b1b5117-6e08-4843-93d6-2da3cf8a6bae
     health: HEALTH_ERR
             1 MDSs report slow metadata IOs
             78 nearfull osd(s)
             1 pool(s) nearfull
             Reduced data availability: 2 pgs inactive, 2 pgs peering
             Degraded data redundancy: 304136/153251211 objects degraded (0.198%), 57 pgs degraded, 57 pgs undersized
             Degraded data redundancy (low space): 265 pgs backfill_toofull
             3 pools have too many placement groups
             74 slow requests are blocked > 32 sec
             80 stuck requests are blocked > 4096 sec

   services:
     mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 98m)
     mgr: ld5505(active, since 3d), standbys: ld5506, ld5507
     mds: pve_cephfs:1 {0=ld3976=up:active} 1 up:standby
     osd: 368 osds: 368 up, 367 in; 302 remapped pgs

   data:
     pools:   5 pools, 8868 pgs
     objects: 51.08M objects, 195 TiB
     usage:   590 TiB used, 563 TiB / 1.1 PiB avail
     pgs:     0.023% pgs not active
              304136/153251211 objects degraded (0.198%)
              1672190/153251211 objects misplaced (1.091%)
              8564 active+clean
              196  active+remapped+backfill_toofull
              57   active+undersized+degraded+remapped+backfill_toofull
              35   active+remapped+backfill_wait
              12   active+remapped+backfill_wait+backfill_toofull
              2    active+remapped+backfilling
              2    peering

   io:
     recovery: 18 MiB/s, 4 objects/s


Currently I'm using 6 OSD nodes.
Node A
48x 1.6TB HDD
Node B
48x 1.6TB HDD
Node C
48x 1.6TB HDD
Node D
48x 1.6TB HDD
Node E
48x 7.2TB HDD
Node F
48x 7.2TB HDD

Question:
Is it advisable to distribute the drives equally over all nodes?
If yes, how should this be executed w/o ceph disruption?

Regards
Thomas

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx