On Fri, Oct 21, 2016 at 10:35 PM, Ridwan Rashid Noel <ridwan064@xxxxxxxxx> wrote:
> Thank you for your reply Greg. Is there any detailed resource that describe
> about how the primary affinity changing works? All I got from searching was
> one paragraph from the documentation.

No, probably nothing detailed.  There isn't much to it though.
Normally, the first up OSD in the raw OSD set returned by CRUSH is the
primary.  primary-affinity then sets the probability with which the
mapping code will choose some other OSD from the raw OSD set to serve
as primary.  (Your quote is technically incorrect, as this adjustment
happens after CRUSH is run, not within CRUSH.)

You can think of it as a step in the mapping process, which will
potentially reorder OSDs in the OSD set.  Say your CRUSH output is
[osd0, osd1, osd2] and all of them are up.  If osd0's primary-affinity
is 1, osd0 will be the primary with p=1 (i.e. always).  If osd0's
primary-affinity is 0.75, either osd1 or osd2 will take its place with
p=0.25 and the resulting OSD set will be one of:

    [osd0, osd1, osd2], p=0.75
    [osd1, osd0, osd2] or [osd2, osd0, osd1], p=0.25

Once you've adjusted the primary-affinity, this choice is fixed until
you either re-adjust the primary-affinity or the mapping is affected in
some other way (e.g. OSD goes down, etc).

Another way to look at it is as follows.  If, according to CRUSH
+ current osdmap, osd0 gets to be the primary for 100 PGs, 25 of those
PGs will be rejected and get another primary, leaving osd0 with 75 PGs.

Why would you want to do this?  To quickly redistribute the read
workload away from overloaded OSDs (writes go to all OSDs in the PG,
while reads are served by the primary).

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
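
For illustration, the reordering step described in this message can be
sketched in Python.  This is a simplified model, not the actual Ceph
code: the hash (SHA-256 in _unit_hash), the pg_seed parameter, the
function names, and the rule of trying the next OSD in CRUSH order when
one is rejected are all assumptions made for the sketch.

```python
import hashlib


def _unit_hash(pg_seed: int, osd: int) -> float:
    """Deterministic stand-in hash mapping (pg_seed, osd) to [0, 1).

    Assumption: any stable hash works for the illustration.
    """
    digest = hashlib.sha256(f"{pg_seed}:{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def apply_primary_affinity(raw_set, affinity, pg_seed):
    """Return raw_set reordered so that the chosen primary comes first.

    raw_set  -- up OSD ids in CRUSH order, e.g. [0, 1, 2]
    affinity -- dict of osd id -> primary-affinity in [0, 1] (default 1)
    pg_seed  -- integer identifying the PG (hypothetical stand-in)
    """
    chosen = None
    for pos, osd in enumerate(raw_set):
        # An OSD accepts the primary role with probability equal to its
        # primary-affinity; the hash makes that decision stable per PG.
        if _unit_hash(pg_seed, osd) < affinity.get(osd, 1.0):
            chosen = pos
            break
    if chosen is None:
        chosen = 0  # every OSD declined: keep the CRUSH order
    # Move the chosen OSD to the front, keeping the others in order.
    return [raw_set[chosen]] + raw_set[:chosen] + raw_set[chosen + 1:]


if __name__ == "__main__":
    pgs = 10000
    kept = sum(
        apply_primary_affinity([0, 1, 2], {0: 0.75}, seed)[0] == 0
        for seed in range(pgs)
    )
    print(f"osd0 stays primary for {kept} of {pgs} PGs")
```

With osd0's primary-affinity at 0.75, osd0 should stay primary for
roughly three quarters of the PGs.  Calling the function twice for the
same PG returns the same order, which mirrors the choice being fixed
until the affinity or the mapping changes.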