Re: crush chooseleaf vs. choose

> In both cases, you only get 2 replicas on the remaining 2 hosts.

OK, I was able to reproduce this with crushtool.
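(A test along these lines should show it -- zero-weighting both OSDs of one
host, which, if I understand crushtool's --weight option correctly, simulates
those OSDs being marked out:

# crushtool --test -i my.map --rule 0 --num-rep 3 --weight 0 0 --weight 1 0 --show-utilization

The map name, rule number and device numbers here are just an example.)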

> The difference is if you have 4 hosts with 2 osds.  In the choose case, you have
> some fraction of the data that chose the down host in the first step (most of the
> attempts, actually!) and then couldn't find a usable osd, leaving you with only 2
> replicas.  With chooseleaf that doesn't happen.

This is also reproducible.
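
(For anyone else trying this: crushtool's --show-bad-mappings option -- if I
read it correctly -- lists the inputs that end up with fewer results than
--num-rep, so the fraction of PGs left with only 2 replicas is easy to see,
e.g.

# crushtool --test -i my.map --rule 1 --num-rep 3 --weight 2 0 --weight 3 0 --show-bad-mappings

where "rule 1" stands for a choose-based rule and the --weight pairs zero out
the OSDs of the "down" host; again, the numbers are only an example.)
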
> The other difference is if you have one of the two OSDs on the host marked out.
> In the choose case, the remaining OSD will get allocated 2x the data; in the
> chooseleaf case, usage will remain proportional with the rest of the cluster and
> the data from the out OSD will be distributed across other OSDs (at least when
> there are > 3 hosts!).

I see, but the data distribution does not seem optimal in that case.
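(That out-vs-in difference should also be easy to check with the same
--weight trick, zero-weighting a single OSD this time, e.g.

# crushtool --test -i my.map --rule 0 --num-rep 3 --weight 1 0 --show-utilization

and comparing the per-device numbers between the two rule styles.  The skew I
am asking about below shows up with all OSDs in, though.)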

For example, using this crush map:

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 root

# buckets
host prox-ceph-1 {
	id -2		# do not change unnecessarily
	# weight 7.260
	alg straw
	hash 0	# rjenkins1
	item osd.0 weight 3.630
	item osd.1 weight 3.630
}
host prox-ceph-2 {
	id -3		# do not change unnecessarily
	# weight 7.260
	alg straw
	hash 0	# rjenkins1
	item osd.2 weight 3.630
	item osd.3 weight 3.630
}
host prox-ceph-3 {
	id -4		# do not change unnecessarily
	# weight 3.630
	alg straw
	hash 0	# rjenkins1
	item osd.4 weight 3.630
}
host prox-ceph-4 {
	id -5		# do not change unnecessarily
	# weight 3.630
	alg straw
	hash 0	# rjenkins1
	item osd.5 weight 3.630
}

root default {
	id -1		# do not change unnecessarily
	# weight 21.780
	alg straw
	hash 0	# rjenkins1
	item prox-ceph-1 weight 7.260   # 2 OSDs
	item prox-ceph-2 weight 7.260   # 2 OSDs
	item prox-ceph-3 weight 3.630   # 1 OSD
	item prox-ceph-4 weight 3.630   # 1 OSD
}

# rules
rule data {
	ruleset 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
# end crush map
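
For reference, if I have the syntax right, the plain "choose" counterpart of
that rule -- the variant discussed above -- would be something like the
following (the rule name and ruleset number are made up; all the numbers
below were produced with the chooseleaf rule 0 above):

rule data-choose {
	ruleset 1
	type replicated
	min_size 1
	max_size 10
	step take default
	step choose firstn 0 type host
	step choose firstn 1 type osd
	step emit
}

That is, first pick the hosts, then one OSD inside each of them; chooseleaf
collapses those two steps and, as far as I understand, retries the host
selection when the descent finds no usable OSD -- which is why the down-host
case above behaves differently.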

crushtool shows the following utilization:

# crushtool --test -i my.map --rule 0 --num-rep 3 --show-utilization
  device 0:	423
  device 1:	452
  device 2:	429
  device 3:	452
  device 4:	661
  device 5:	655

Any explanation for that?  Maybe related to the small number of devices?
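
If I do the arithmetic on those numbers (1024 test values x 3 replicas = 3072
placements in total), a strictly weight-proportional split per host would be:

  prox-ceph-1:  3072 * 7.26 / 21.78 = 1024    (observed: 423 + 452 = 875)
  prox-ceph-2:  3072 * 7.26 / 21.78 = 1024    (observed: 429 + 452 = 881)
  prox-ceph-3:  3072 * 3.63 / 21.78 =  512    (observed: 661)
  prox-ceph-4:  3072 * 3.63 / 21.78 =  512    (observed: 655)

So the two 2-OSD hosts fall short of their proportional share and the two
1-OSD hosts absorb the difference.  My guess is that this happens because a
host can hold at most one of the 3 replicas per PG, so prox-ceph-1/-2 could
only reach their target by being selected for every single PG, which the
straw draws do not guarantee with only 4 hosts -- but I would be happy to be
corrected.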

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



