Run 'ceph osd crush tunables optimal' or adjust an offline map file via
the crushtool command line (more annoying) and retest; I suspect that is
the problem.

http://ceph.com/docs/master/rados/operations/crush-map/#tunables

sage

On Fri, 3 Jan 2014, Dietmar Maurer wrote:
> > In both cases, you only get 2 replicas on the remaining 2 hosts.
>
> OK, I was able to reproduce this with crushtool.
>
> > The difference is if you have 4 hosts with 2 osds. In the choose case, you have
> > some fraction of the data that chose the down host in the first step (most of the
> > attempts, actually!) and then couldn't find a usable osd, leaving you with only 2
> > replicas. With chooseleaf that doesn't happen.
>
> This is also reproducible.
>
> > The other difference is if you have one of the two OSDs on the host marked out.
> > In the choose case, the remaining OSD will get allocated 2x the data; in the
> > chooseleaf case, usage will remain proportional with the rest of the cluster and
> > the data from the out OSD will be distributed across other OSDs (at least when
> > there are > 3 hosts!).
>
> I see, but data distribution seems not optimal in that case.
> For example using this crush map:
>
> # types
> type 0 osd
> type 1 host
> type 2 rack
> type 3 row
> type 4 room
> type 5 datacenter
> type 6 root
>
> # buckets
> host prox-ceph-1 {
>     id -2        # do not change unnecessarily
>     # weight 7.260
>     alg straw
>     hash 0       # rjenkins1
>     item osd.0 weight 3.630
>     item osd.1 weight 3.630
> }
> host prox-ceph-2 {
>     id -3        # do not change unnecessarily
>     # weight 7.260
>     alg straw
>     hash 0       # rjenkins1
>     item osd.2 weight 3.630
>     item osd.3 weight 3.630
> }
> host prox-ceph-3 {
>     id -4        # do not change unnecessarily
>     # weight 3.630
>     alg straw
>     hash 0       # rjenkins1
>     item osd.4 weight 3.630
> }
>
> host prox-ceph-4 {
>     id -5        # do not change unnecessarily
>     # weight 3.630
>     alg straw
>     hash 0       # rjenkins1
>     item osd.5 weight 3.630
> }
>
> root default {
>     id -1        # do not change unnecessarily
>     # weight 21.780
>     alg straw
>     hash 0       # rjenkins1
>     item prox-ceph-1 weight 7.260    # 2 OSDs
>     item prox-ceph-2 weight 7.260    # 2 OSDs
>     item prox-ceph-3 weight 3.630    # 1 OSD
>     item prox-ceph-4 weight 3.630    # 1 OSD
> }
>
> # rules
> rule data {
>     ruleset 0
>     type replicated
>     min_size 1
>     max_size 10
>     step take default
>     step chooseleaf firstn 0 type host
>     step emit
> }
> # end crush map
>
> crushtool shows the following utilization:
>
> # crushtool --test -i my.map --rule 0 --num-rep 3 --show-utilization
>   device 0: 423
>   device 1: 452
>   device 2: 429
>   device 3: 452
>   device 4: 661
>   device 5: 655
>
> Any explanation for that? Maybe related to the small number of devices?

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
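The choose-vs-chooseleaf failure mode quoted above can be sketched with a small simulation. This is not real CRUSH: it uses uniform random selection instead of CRUSH's pseudo-random straw hashing, and the host/OSD layout (4 hosts with 2 OSDs each, one whole host down) is just the scenario from the thread. It only illustrates why `step choose ... type host` silently loses a replica slot when the chosen host has no usable OSD, while `step chooseleaf` retries with a different host:

```python
import random

# 4 hosts with 2 OSDs each; one whole host is down (the scenario above)
HOSTS = {f"host{i}": [f"osd.{2 * i}", f"osd.{2 * i + 1}"] for i in range(4)}
DOWN = set(HOSTS["host3"])

def pick_hosts(n, exclude=()):
    # uniform stand-in for CRUSH's pseudo-random host selection
    return random.sample([h for h in HOSTS if h not in exclude], n)

def place_choose(n=3):
    # 'step choose firstn N type host' + 'choose firstn 1 type osd':
    # if a chosen host has no usable OSD, that replica slot is simply lost
    replicas = []
    for host in pick_hosts(n):
        up = [o for o in HOSTS[host] if o not in DOWN]
        if up:
            replicas.append(random.choice(up))
    return replicas

def place_chooseleaf(n=3):
    # 'step chooseleaf firstn N type host': a host that yields no usable
    # OSD is rejected and a different host is tried instead
    replicas, tried = [], set()
    while len(replicas) < n and len(tried) < len(HOSTS):
        host = pick_hosts(1, exclude=tried)[0]
        tried.add(host)
        up = [o for o in HOSTS[host] if o not in DOWN]
        if up:
            replicas.append(random.choice(up))
    return replicas

random.seed(1)
trials = 10000
short_choose = sum(len(place_choose()) < 3 for _ in range(trials))
short_leaf = sum(len(place_chooseleaf()) < 3 for _ in range(trials))
print(f"choose:     {short_choose}/{trials} placements with only 2 replicas")
print(f"chooseleaf: {short_leaf}/{trials} placements with only 2 replicas")
```

With choose, the down host is among the 3 selected hosts in 3 out of 4 placements ("most of the attempts, actually!"), so roughly 75% of placements end up with only 2 replicas; with chooseleaf, none do.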