On Wed, May 30, 2018 at 9:19 AM, zengran zhang <z13121369189@xxxxxxxxx> wrote:
> Hi
> Let's say the acting set is [3, 1, 0] and obj1 was marked missing on
> osd.0 after peering. New IO on obj1 will wait until obj1 is recovered.
> So my question is: why can't we do the new IO on [3, 1] and let osd.0
> keep missing obj1 without waiting on recovery, with osd.0 updating only
> its pglog, like backfill does? If the number of OSDs with the newest
> object is more than min_size, do we need to wait for recovery?

This isn't impossible to do, but it's *yet another* piece of metadata
we'd need to keep track of and account for in all the other recovery
and IO paths, so nobody's done it yet. Somebody would have to design
the algorithms, write the code, and persuade us the UX/performance
improvement is worth the ongoing maintenance burden.

In particular, note that any write-before-recovery needs to make sure
it still obeys the min_size rules, and I don't think any systems are
set up to enable that tracking separate from the acting set right now.

I haven't looked closely at this code in a while so I don't know how
easy it would be to implement, or how high the bar for accepting such
a PR might be. It's not a bad thing to look at AFAIK, though! :)
-Greg

>
> I see the new async recovery feature moves osd.0 from acting to
> async_recover_target, keeping the acting set bigger than min_size, and
> osd.0 is chosen because it has more missing objects, so objects
> missing on the acting set still need to be recovered first...
>
> best regards
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
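
To make the min_size point concrete, here is a minimal sketch of the check a write-before-recovery scheme would need for the [3, 1, 0] example. This is not Ceph code: `can_write_without_recovery` and its arguments are hypothetical names chosen for illustration, and the sketch assumes a write may proceed when the acting OSDs that already hold the object number at least min_size.

```python
def can_write_without_recovery(acting, missing, obj, min_size):
    """Hypothetical check, not part of Ceph.

    acting   -- list of OSD ids in the acting set, e.g. [3, 1, 0]
    missing  -- dict mapping OSD id -> set of object names missing on that OSD
    obj      -- name of the object the new write targets
    min_size -- the pool's min_size
    Returns (ok, writers): whether the write could proceed, and the acting
    OSDs that hold a current copy of obj and would take the write.
    """
    # Keep only the acting OSDs on which obj is not marked missing.
    writers = [osd for osd in acting if obj not in missing.get(osd, set())]
    # Assumption: the write is allowed iff at least min_size OSDs take it.
    return len(writers) >= min_size, writers


# The scenario from the question: obj1 is missing on osd.0 only.
ok, writers = can_write_without_recovery(
    acting=[3, 1, 0], missing={0: {"obj1"}}, obj="obj1", min_size=2
)
print(ok, writers)  # True [3, 1]
```

The per-object `writers` list is the kind of state that would have to be tracked separately from the acting set, which is the extra metadata the reply above is concerned about.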