Weights: Hosts vs. OSDs

Good evening,

for some time now we have had the problem that ceph stores too much data
on a host with small disks. Originally we used weight 1 = 1 TB, but we
have since reduced the weight for this particular host even further just
to keep it alive.

Our setup currently consists of 3 hosts:

    wein: 6x 136G (fast disks)
    kaffee: 1x 5.5T (slow disk)
    tee: 1x 5.5T (slow disk)

We originally started with the 6 osds on wein at a weight of 0.13 each,
but had to reduce that to 0.05 because the disks were running full.
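
Weights like these can be changed at runtime with the usual crush
reweight call, for example:

    # example: set the CRUSH weight of osd.0 to 0.05
    # (roughly 50 GB under the weight 1 = 1 TB convention)
    ceph osd crush reweight osd.0 0.05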

The current tree looks as follows:

root@wein:~# ceph osd tree
# id    weight  type name   up/down reweight
-1  2.3 root default
-2  0.2999      host wein
0   0.04999         osd.0   up  1   
3   0.04999         osd.3   up  1   
4   0.04999         osd.4   up  1   
5   0.04999         osd.5   up  1   
6   0.04999         osd.6   up  1   
7   0.04999         osd.7   up  1   
-3  1       host tee
1   5.5         osd.1   up  1   
-4  1       host kaffee
2   5.5         osd.2   up  1


The hosts have the following disk usage:

root@wein:~# df -h | grep ceph
/dev/sdc1       136G   58G   79G  43% /var/lib/ceph/osd/ceph-0
/dev/sdd1       136G   54G   83G  40% /var/lib/ceph/osd/ceph-3
/dev/sde1       136G   31G  105G  23% /var/lib/ceph/osd/ceph-4
/dev/sdf1       136G   62G   75G  46% /var/lib/ceph/osd/ceph-5
/dev/sdg1       136G   45G   92G  33% /var/lib/ceph/osd/ceph-6
/dev/sdh1       136G   28G  109G  21% /var/lib/ceph/osd/ceph-7

root@kaffee:~# df -h | grep ceph
/dev/sdc                  5.5T  983G  4.5T  18% /var/lib/ceph/osd/ceph-2

root@tee:~# df -h | grep ceph
/dev/sdb        5.5T  967G  4.6T  18% /var/lib/ceph/osd/ceph-1


On wein about 46G are stored per osd on average, tee/kaffee store 975G on average.
    (58+54+31+62+45+28)/6 ≈ 46.3
    (967+983)/2 = 975.0
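
The per-osd average can also be taken straight from df, e.g. on wein:

    # average the "Used" column over all ceph osd mounts, in GB
    df -BG | awk '/ceph-/ {used+=$3; n++} END {print used/n, "G per osd"}'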


The weight ratio of a kaffee/tee osd to a wein osd is
    5.5/0.05 = 110.0

The usage ratio of a kaffee/tee osd to a wein osd is
    975.0/46.3 ≈ 21.1

So ceph is allocating about 5 times more storage to the wein osds than
we want it to:
    110/(975.0/46.3) ≈ 5.2
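
The same numbers spelled out with bc, in case someone wants to check:

    echo "5.5/0.05" | bc -l          # weight ratio: 110
    echo "975.0/46.3" | bc -l        # usage ratio: ~21.1
    echo "110/(975.0/46.3)" | bc -l  # factor we are off by: ~5.2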

We are also a bit puzzled that in the osd tree the host weight for wein
is 0.3 while for tee/kaffee it is 1. For wein the host weight is the sum
of its osd weights, but for kaffee and tee it is not. Looking at the
crushmap, however, the host weight there is displayed as 5.5!
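
To double-check which weights CRUSH is actually using, the compiled map
can also be dumped as JSON, which lists every bucket with its item
weights:

    # show buckets, item weights and rules of the map currently in use
    ceph osd crush dump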

Does anyone have an idea what may be going wrong here?

While writing this I noticed that the factor we are off by is close to
5.5, so I *guess* that ceph treats all hosts with the same weight (even
though the osd tree and the crushmap look different to me in that
respect)?

You can find our crushmap attached below.

Cheers,

Nico

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host wein {
        id -2           # do not change unnecessarily
        # weight 0.300
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.050
        item osd.3 weight 0.050
        item osd.4 weight 0.050
        item osd.5 weight 0.050
        item osd.6 weight 0.050
        item osd.7 weight 0.050
}
host tee {
        id -3           # do not change unnecessarily
        # weight 5.500
        alg straw
        hash 0  # rjenkins1
        item osd.1 weight 5.500
}
host kaffee {
        id -4           # do not change unnecessarily
        # weight 5.500
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 5.500
}
root default {
        id -1           # do not change unnecessarily
        # weight 2.300
        alg straw
        hash 0  # rjenkins1
        item wein weight 0.300
        item tee weight 1.000
        item kaffee weight 1.000
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
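
For completeness, this is how the text map above is dumped, and how an
edited version could be compiled and injected again:

    # dump and decompile the current crush map
    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt

    # after editing crush.txt: recompile and inject it
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new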



-- 
New PGP key: 659B 0D91 E86E 7E24 FD15  69D0 C729 21A1 293F 2D24