Re: trying to understand CRUSH more deeply


 



Per section 3.4.4, the default bucket type, straw, computes the hash of (PG number, replica number, item id) for all items in the bucket using the Jenkins integer hashing function, then multiplies it by a function of the item's weight (for OSD disks a weight of 1 corresponds to 1 TB; for higher-level buckets the weight is the sum of the contained weights). The selection function chooses the bucket/disk with the maximum value:
c(r, x) = max_i( f(w_i) * hash(x, r, i) )
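The selection rule can be sketched in a few lines. This is a simplified model, not Ceph's implementation: SHA-256 stands in for the Jenkins integer hash, and the item's weight is used directly as the scaling factor f(w_i) (the real straw bucket precomputes "straw lengths" from the weights):

```python
import hashlib

def draw(pg, replica, item_id):
    # Stand-in for the Jenkins integer hash CRUSH uses: any
    # deterministic, roughly uniform hash illustrates the idea.
    h = hashlib.sha256(f"{pg}:{replica}:{item_id}".encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64  # uniform in [0, 1)

def straw_select(pg, replica, items):
    # items: {item_id: weight}. Every item draws a "straw" whose
    # length is its hash scaled by (a function of) its weight;
    # the item with the longest straw wins the placement.
    return max(items, key=lambda i: items[i] * draw(pg, replica, i))
```

Because the draw for each (pg, replica, item) triple is deterministic, the same inputs always select the same item; a map change only moves a placement where it changes the winner of this competition.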

So if you add an OSD disk, a new disk id enters this competition and will take PGs from the other OSDs in proportion to its weight, which is the desirable effect. A side effect is that the weight hierarchy has changed slightly, so some older buckets may now win PGs from other older buckets according to the hash function.
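The proportional capture can be checked with a toy experiment. This is a self-contained simplified model (SHA-256 in place of the Jenkins hash, weight used directly as the scaling factor, made-up item and PG counts). Note that in this simplified model the old items' straws are unchanged, so PGs move only to the new item; the extra reshuffling among old buckets comes from the real straw bucket recomputing all straw lengths interdependently from the weights.

```python
import hashlib

def draw(pg, replica, item_id):
    # Deterministic stand-in for CRUSH's Jenkins integer hash.
    h = hashlib.sha256(f"{pg}:{replica}:{item_id}".encode()).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def straw_select(pg, replica, items):
    # Longest weighted straw wins the placement.
    return max(items, key=lambda i: items[i] * draw(pg, replica, i))

old_map = {i: 1.0 for i in range(4)}   # four equal-weight items
new_map = {i: 1.0 for i in range(5)}   # add item 4 with the same weight

n_pgs = 10000
before = {pg: straw_select(pg, 0, old_map) for pg in range(n_pgs)}
after = {pg: straw_select(pg, 0, new_map) for pg in range(n_pgs)}

moved_to_new = sum(after[pg] == 4 for pg in range(n_pgs))
reshuffled = sum(after[pg] not in (before[pg], 4) for pg in range(n_pgs))
print(moved_to_new / n_pgs)  # roughly 1/5: proportional to the new weight
print(reshuffled)            # 0 in this model: the old straws are unchanged
```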

So straw does have overhead when adding (rather than replacing); it does not do strictly minimal PG reassignment. But in terms of overall efficiency of adding and removing buckets, both at the end and in the middle of the hierarchy, it is the best among the algorithms, as seen in Figure 5 and Table 2.

On 2017-09-22 08:36, Will Zhao wrote:

Hi Sage and all:
    I am trying to understand CRUSH more deeply. I have tried to read
the code and the paper, and to search the mailing list archives, but I
still have some questions and can't understand it well.
    If I have 100 OSDs and I add an OSD, the osdmap changes; how is the
PG mapping recalculated so that data movement is minimal? I tried to use
crushtool --show-mappings --num-rep 3 --test -i map, changing the map
between 100 OSDs and 101 OSDs, to look at the result, and it looks like
the pgmap changed a lot. Shouldn't the remap only happen to some of the
PGs? Or does CRUSH treat adding an OSD differently from a new osdmap? I
know I must be misunderstanding something. I would appreciate it if you
could explain more about the logic of adding an OSD. Or is there more
documentation that I can read? Thank you very much!!! :)
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

 

 
