On Sat, Jul 1, 2017 at 8:25 AM, sheng qiu <herbert1984106@xxxxxxxxx> wrote:
> Hi,
>
> We are trying to reduce the peering processing latency, since it can
> block front-end IO.
>
> In our experiment, we kill a given OSD and bring it back after a very
> short time. We checked the performance counters, shown below:
>
> "peering_latency": {
>     "avgcount": 52,
>     "sum": 52.435308773,
>     "avglat": 1008.371323
> },
> "getinfo_latency": {
>     "avgcount": 52,
>     "sum": 3.525831625,
>     "avglat": 67.804454
> },
> "getlog_latency": {
>     "avgcount": 46,
>     "sum": 0.255325943,
>     "avglat": 5.550564
> },
> "getmissing_latency": {
>     "avgcount": 46,
>     "sum": 0.000877735,
>     "avglat": 0.019081
> },
> "waitupthru_latency": {
>     "avgcount": 46,
>     "sum": 48.652836368,
>     "avglat": 1057.670356
> }
>
> As shown, the average peering latency is 1008 ms, most of which is spent
> in "waitupthru_latency". Looking at the code, I don't quite understand
> this part. Can anyone explain it, and in particular why this stage takes
> so long?

Could you open a tracker for this issue and provide logs with "debug osd = 20"
from both the primary and the replica OSDs? As Greg has mentioned, this is not
a measure of local work, so it's important to look at all of the OSDs involved.

> I also noticed there is a description of "fast peering" at
> http://tracker.ceph.com/projects/ceph/wiki/Osd_-_Faster_Peering
>
> Is this still ongoing, or is it stale?
>
> I would appreciate any reply.
>
> Thanks,
> Sheng

--
Cheers,
Brad
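
For reference, the "avglat" figures quoted above are simply sum/avgcount converted to milliseconds (e.g. 52.435308773 s / 52 = 1.008 s, i.e. 1008.37 ms for peering_latency). Below is a minimal sketch of pulling the same recovery-state counters from a running OSD and recomputing those averages. It assumes the counters are exposed under a "recoverystate_perf" section of "ceph daemon osd.<id> perf dump" and uses a hypothetical OSD id, so adjust for your release.

#!/usr/bin/env python3
# Sketch: dump perf counters over the OSD admin socket and recompute the
# per-state average latency in milliseconds ("sum" is reported in seconds).
import json
import subprocess

OSD_ID = 0  # hypothetical OSD id; substitute the OSD under test

def peering_averages(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.{}".format(osd_id), "perf", "dump"])
    # Assumption: recovery-state latencies live under "recoverystate_perf".
    states = json.loads(out).get("recoverystate_perf", {})
    for name in ("peering_latency", "getinfo_latency", "getlog_latency",
                 "getmissing_latency", "waitupthru_latency"):
        counter = states.get(name)
        if not counter or not counter.get("avgcount"):
            continue
        avg_ms = 1000.0 * counter["sum"] / counter["avgcount"]
        print("{:<22} avgcount={:<4d} avg={:10.3f} ms".format(
            name, counter["avgcount"], avg_ms))

if __name__ == "__main__":
    peering_averages(OSD_ID)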
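
As for capturing the logs Brad asks for, a sketch of raising the OSD debug level is below; the ceph.conf form persists across restarts, while the injectargs form applies to daemons that are already running. Treat the exact option spelling as an assumption to check against your release's documentation.

# In ceph.conf on the OSD hosts (restart the OSDs to pick it up):
[osd]
    debug osd = 20

# Or at runtime, from a node with admin credentials:
ceph tell osd.* injectargs '--debug-osd 20'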