Re: Ceph question

Sage Weil <sage@xxxxxxxxxxx> · Sun, 8 Jul 2012 21:13:22 -0700 (PDT)

On Sun, 8 Jul 2012, HarmeekSingh Bedi wrote:
> Hi there.
> 
>   Quick question - I was looking at the code to understand the CRUSH
> placement based on a unique bijection built from prime number
> generation. I could not locate the specific code where we generate
> p > m as described in the papers. bucket_perm_choose seems to use the
> "jenkins" hash function to choose a permutation, which differs from
> the paper. Would the code still generate a unique bijection from the
> set {1, ..., m} to some permutation of the same set, to guarantee that
> replicas are not placed on the same sub-cluster devices?
> 
> Am I missing something here? Kindly help me understand.

The original code used prime number arithmetic to generate a
permutation, but the current code does not. The prime number method did
not in fact generate a purely uniform distribution; it favored certain
choices.
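
For illustration, here is a minimal sketch of a stride-based scheme of
the kind the question describes, assuming the form
c(r, x) = (hash(x) + r*p) mod m with p a prime greater than m. The names
stride_permutation and count_reachable are hypothetical, and h is a
stand-in for hash(x); this is not the original Ceph code. It shows that
the mapping is a bijection, and one way such a scheme can fall short of
uniform: only a small subset of the m! orderings is reachable. That may
not be the exact bias meant above.

```python
from math import gcd

def stride_permutation(h, p, m):
    """Map r = 0..m-1 to (h + r*p) mod m; h stands in for hash(x)."""
    return [(h + r * p) % m for r in range(m)]

def count_reachable(m):
    """Count distinct orderings over all offsets h and strides coprime to m."""
    seen = set()
    for h in range(m):
        for s in range(1, m):
            # A prime p > m always reduces (mod m) to a stride coprime to m.
            if gcd(s, m) == 1:
                seen.add(tuple(stride_permutation(h, s, m)))
    return len(seen)

m = 5
perm = stride_permutation(h=3, p=7, m=m)  # 7 is a prime greater than m
assert sorted(perm) == list(range(m))     # every slot is hit exactly once
reachable = count_reachable(m)            # 20 of the 5! = 120 orderings
```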

In fact, the current implementation is not perfect either (although it is 
better), as later items in the permuted set are more likely to be swapped 
at least once.
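
To address the bijection question directly: a swap-based construction
yields a permutation by design, because it only ever exchanges entries
of an array that starts out as 0..size-1. The sketch below assumes a
hash-driven swap loop of that kind. stand_in_hash is a hypothetical
replacement for the real hash (it is not the Jenkins hash), and
perm_choose is an illustrative name, not bucket_perm_choose itself. It
only demonstrates that distinct r values give distinct items; it does
not try to reproduce or measure the bias mentioned above.

```python
import hashlib

def stand_in_hash(x, bucket_id, p):
    """Hypothetical stand-in hash; NOT the Jenkins hash used by CRUSH."""
    digest = hashlib.sha256(f"{x}:{bucket_id}:{p}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

def perm_choose(x, bucket_id, size, r):
    """Return entry r of a hash-driven permutation of range(size)."""
    perm = list(range(size))
    pr = r % size
    # Fix positions 0..pr one at a time; later swaps never touch them.
    for p in range(pr + 1):
        if p < size - 1:  # swapping the final entry would be a no-op
            i = stand_in_hash(x, bucket_id, p) % (size - p)
            perm[p], perm[p + i] = perm[p + i], perm[p]
    return perm[pr]

# For a fixed input x, replicas r = 0..size-1 land on distinct items.
picks = [perm_choose(x=1234, bucket_id=-2, size=7, r=r) for r in range(7)]
assert sorted(picks) == list(range(7))
```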

In practice, it probably doesn't matter much.  Ceph uses 'straw' buckets
by default (not uniform buckets), which do not have these problems and
are more flexible in general.  The extra computational cost of using
straw buckets is highly unlikely to be significant.
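
As a rough sketch of the straw idea, assuming the selection rule from
the CRUSH paper: every item draws a pseudo-random straw scaled by a
per-item factor, and the longest straw wins. stand_in_hash and
straw_choose are hypothetical names, the hash is not the one CRUSH
uses, and the straws list is assumed to be precomputed from the item
weights (that step is not shown). Note that a straw draw on its own can
return the same item for two different r values, so distinct replicas
have to be enforced by whatever calls it.

```python
import hashlib

def stand_in_hash(x, item, r):
    """Hypothetical stand-in hash; NOT the Jenkins hash used by CRUSH."""
    digest = hashlib.sha256(f"{x}:{item}:{r}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

def straw_choose(x, r, items, straws):
    """Pick the item whose scaled pseudo-random draw is the longest."""
    best_item, best_draw = None, -1
    for item, straw in zip(items, straws):
        # Keep 16 bits of the hash, then scale by the item's straw factor.
        draw = (stand_in_hash(x, item, r) & 0xFFFF) * straw
        if draw > best_draw:
            best_item, best_draw = item, draw
    return best_item

items = [10, 11, 12, 13]
straws = [1, 1, 1, 1]  # equal factors, i.e. equally weighted items
choice = straw_choose(x=1234, r=0, items=items, straws=straws)
assert choice in items
```

One consequence of picking the maximum draw: adding an item can only
move an input to the new item or leave it where it was, since every
existing item's draw is unchanged.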

sage