I have a question about how CRUSH determines OSD locations.
From the CRUSH paper I gather that the replica locations of an object (A) form a vector (v) obtained from the function c(r,x) = (hash(x) + rp) mod m.
Now, when new OSDs are added, objects are shuffled to maintain a uniform data distribution. What in the above equation changes so that only minimal movement is needed? More specifically, if nothing in the equation changes, then all the objects map to the same locations again. If p is changed, then many object locations can change. So how does CRUSH guarantee only minimal data movement?
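To make the concern concrete, here is a minimal sketch of the formula c(r,x) = (hash(x) + rp) mod m as I understand it. It is not Ceph code: the SHA-256 based stable_hash is only a stand-in for whatever hash CRUSH really uses, fraction_moved is a helper I made up for illustration, and the values m = 10, m = 11 and p = 13 are arbitrary choices (with p a prime larger than m).

```python
import hashlib


def stable_hash(x: str) -> int:
    # Deterministic stand-in hash (assumption: NOT the hash CRUSH actually uses).
    return int.from_bytes(hashlib.sha256(x.encode()).digest()[:8], "big")


def c(r: int, x: str, m: int, p: int) -> int:
    # Placement of replica r of object x among m items, per the formula above.
    return (stable_hash(x) + r * p) % m


def fraction_moved(objects, r, m_old, m_new, p):
    # Hypothetical helper: share of objects whose replica r lands on a
    # different item index after the item count changes from m_old to m_new.
    moved = sum(1 for x in objects if c(r, x, m_old, p) != c(r, x, m_new, p))
    return moved / len(objects)


objects = [f"obj{i}" for i in range(10000)]

# Grow from 10 to 11 items while keeping p fixed. An ideal rebalance would
# move only about 1/11 of the objects; a plain "mod m" moves far more.
frac = fraction_moved(objects, r=0, m_old=10, m_new=11, p=13)
print(f"fraction moved: {frac:.3f}")
```

With these assumptions, changing only m already relocates most objects (typically around 90% in this sketch), which is why I do not see how this formula alone can give minimal movement.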
A follow-up question: if there is ongoing I/O to an object, the primary replica is the one being updated. Does the reshuffling in that case exclude currently hot objects from movement?
Thanking you sincerely,
Shesha