Sorry for the noise. I have found out the cause in our setup: we were
gathering too many logs in our RADOS IO path, and the latency seems
reasonable (about 0.026 ms) when we don't gather that many logs.

2015-08-05 20:29 GMT+08:00 Sage Weil <sage@xxxxxxxxxxxx>:
> On Wed, 5 Aug 2015, Ding Dinghua wrote:
>> 2015-08-05 0:13 GMT+08:00 Somnath Roy <Somnath.Roy@xxxxxxxxxxx>:
>> > Yes, it has to re-acquire pg_lock today..
>> > But, between journal write and initiating the ondisk ack, there is one context switch in the code path. So, I guess the pg_lock is not the only one that is causing this 1 ms delay...
>> > Not sure increasing the finisher threads will help in the pg_lock case as it will be more or less serialized by this pg_lock..
>> My concern is, if pg lock of pg A has been grabbed, not only ondisk
>> callback of pg A is delayed, since ondisk_finisher has only one
>> thread, ondisk callback of other pgs will be delayed too.
>
> I wonder if an optimistic approach might help here by making the
> completion synchronous and doing something like
>
>    if (pg->lock.TryLock()) {
>       pg->_finish_thing(completion->op);
>       delete completion;
>    } else {
>       finisher.queue(completion);
>    }
>
> or whatever.  We'd need to ensure that we aren't holding any lock or
> throttle budget that the pg could deadlock against.
>
> sage

--
Ding Dinghua
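
For reference, the optimistic completion quoted above can be sketched as a small standalone C++ example. This is only an illustration under stated assumptions, not Ceph code: FakePG, FakeFinisher, complete_ondisk and complete_while_locked are hypothetical names, std::mutex stands in for the PG lock, and the finisher is reduced to a queue that the caller drains by hand.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a placement group; not Ceph's PG class.
struct FakePG {
  std::mutex lock;
  int finished = 0;  // completions applied; guarded by lock

  // Caller must hold lock.
  void finish_thing() { ++finished; }
};

// Hypothetical stand-in for the ondisk finisher: a queue of deferred callbacks.
class FakeFinisher {
 public:
  void queue(std::function<void()> fn) {
    std::lock_guard<std::mutex> g(qlock_);
    q_.push_back(std::move(fn));
  }

  // Run everything queued so far; returns how many callbacks ran.
  std::size_t drain() {
    std::deque<std::function<void()>> batch;
    {
      std::lock_guard<std::mutex> g(qlock_);
      batch.swap(q_);
    }
    for (auto& fn : batch) fn();
    return batch.size();
  }

 private:
  std::mutex qlock_;
  std::deque<std::function<void()>> q_;
};

// Optimistic completion: finish inline if the PG lock is free, otherwise
// defer to the finisher. Returns true when the completion ran inline.
bool complete_ondisk(FakePG& pg, FakeFinisher& fin) {
  std::unique_lock<std::mutex> l(pg.lock, std::try_to_lock);
  if (l.owns_lock()) {
    pg.finish_thing();  // lock is released when l goes out of scope
    return true;
  }
  fin.queue([&pg] {
    std::lock_guard<std::mutex> g(pg.lock);  // blocking acquire on the slow path
    pg.finish_thing();
  });
  return false;
}

// Contended case: this thread holds the PG lock while another thread tries
// to complete. Returns whether that completion ran inline.
bool complete_while_locked(FakePG& pg, FakeFinisher& fin) {
  bool ran_inline = true;
  std::unique_lock<std::mutex> held(pg.lock);
  std::thread t([&] { ran_inline = complete_ondisk(pg, fin); });
  t.join();
  return ran_inline;
}
```

In this sketch a contended PG only delays its own completion until the queue is drained, while completions for uncontended PGs finish inline. The success path releases the lock when the guard goes out of scope, which the quoted snippet leaves implicit. std::mutex::try_lock is allowed to fail spuriously, which here just means an occasional extra trip through the queue. The sketch does not address the deadlock concern about locks or throttle budget held by the caller.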