[dropping ceph-users]

On Sat, 10 Oct 2015, wangsongbo wrote:
> Hi all,
> When an osd is marked out, related IO will be blocked, in which case
> applications built on ceph will fail. According to test results, the larger
> the data is, the longer it takes to elapse.
> How can we reduce the impact of this process on the IO?

When you mark an osd out the mon is doing prime_pg_temp, which
preemptively remaps the PG to the same OSDs. This should make peering
fast... except that the OSDs still have to do a cycle of up_thru updates.

Sam, I think we can do the following to avoid this:

- in build prior, we can infer that the interval for last_epoch_started
  is also a rw interval (because clearly it finished peering).
- if the acting set and primary do not change, we can skip the up_thru
  update (because we will already infer rw from above).

I think the only caveat is that we can only skip the up_thru update once
the entire cluster has a feature bit indicating they understand that
last_epoch_started implies rw.

Anyway, this would mean that there's no subsequent mon interaction after
the mark out (and probably lots of other common scenarios)...

What do you think?
sage
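
To make the two bullets and the feature-bit caveat concrete, here is a rough sketch of the decision logic in Python. It is a toy model, not Ceph code: the `Interval` type, its fields, and the function names `maybe_went_rw` and `can_skip_up_thru` are all made up for illustration. It also assumes that "the interval for last_epoch_started" means the interval whose epoch range contains last_epoch_started, and that the skip only applies when the previous interval is that one.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Interval:
    """Hypothetical, simplified stand-in for a past PG interval."""
    first_epoch: int           # first osdmap epoch of the interval
    last_epoch: int            # last osdmap epoch of the interval
    acting: Tuple[int, ...]    # acting set (OSD ids)
    primary: int               # acting primary
    up_thru_recorded: bool     # mon recorded the primary's up_thru for this interval


def maybe_went_rw(interval: Interval, last_epoch_started: int) -> bool:
    """Could this interval have served writes?"""
    # Rule in this toy model: a recorded up_thru means the interval may have gone rw.
    if interval.up_thru_recorded:
        return True
    # Proposed inference: the interval containing last_epoch_started
    # clearly finished peering, so treat it as rw as well.
    return interval.first_epoch <= last_epoch_started <= interval.last_epoch


def can_skip_up_thru(prev: Interval,
                     new_acting: Sequence[int],
                     new_primary: int,
                     last_epoch_started: int,
                     cluster_has_feature: bool) -> bool:
    """Decide whether the new interval can skip the up_thru round trip."""
    # Caveat: every OSD must understand that last_epoch_started implies rw.
    if not cluster_has_feature:
        return False
    # The acting set and primary must be unchanged across the map change.
    if tuple(new_acting) != prev.acting or new_primary != prev.primary:
        return False
    # Assumption: prev must be the interval in which peering completed.
    return prev.first_epoch <= last_epoch_started <= prev.last_epoch
```

In the mark-out case, prime_pg_temp keeps the PG on the same OSDs, so in this model `can_skip_up_thru` would return True once the feature bit is set cluster-wide, and the OSDs would not need to go back to the mon.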