> I finally got around to looking at the dump_historic_ops output for the > 1-client and 2-client cases. > As you recall these are all read-ops. so the events in the dump are > initiated > reached_pg > started > done > > The pattern I see for most of the slow ops recorded in the dump is: > > * In the 1-client case the typical slow op has duration between 50-65 ms > and usually most of this is the interval between reached_pg and started. > > * In the 2-client case the typical slow op has duration between 95-120 ms > and again usually most of this is the interval between reached_pg > and started. > > Could someone describe what the interval between reached_pg and started > means? I think the slow part is probably find_object_context() (although to be fair tons of stuff happens here, see do_op()). You could test this theory or otherwise narrow this down with additional event markers lke diff --git a/src/osd/ReplicatedPG.cc b/src/osd/ReplicatedPG.cc index d6f3084..6faccc2 100644 --- a/src/osd/ReplicatedPG.cc +++ b/src/osd/ReplicatedPG.cc @@ -1691,10 +1691,12 @@ void ReplicatedPG::do_op(OpRequestRef& op) return; } + op->mark_event("about to find"); int r = find_object_context( oid, &obc, can_create, m->has_flag(CEPH_OSD_FLAG_MAP_SNAP_CLONE), &missing_oid); + op->mark_event("found"); if (r == -EAGAIN) { // If we're not the primary of this OSD, and we have sage -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html