Hi Eric,

Thank you for your kind reply. In our test, we set the following in ceph.conf:

  osd_op_queue = mclock_client
  osd_op_queue_cut_off = high
  osd_op_queue_mclock_client_op_lim = 100.0
  osd_op_queue_mclock_client_op_res = 50.0
  osd_op_num_shards = 1
  osd_op_num_threads_per_shard = 1

With this setup, all I/O requests should go through a single mclock_client
queue and be scheduled by mclock (osd_op_queue_cut_off = high). We use fio
for the test, with job=1, bs=4k, and qd=1 or 16. We expect the IOPS
reported by fio to be below 100, but we see a much higher value.

Did we understand your work correctly, or did we miss anything?

Thanks,
Sheng

On Wed, Jun 21, 2017 at 2:04 PM, J. Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
> Hi Sheng,
>
> I'll interleave responses below.
>
> On 06/21/2017 01:38 PM, sheng qiu wrote:
>> Hi Eric,
>>
>> We are very interested in your dmclock integration work with Ceph.
>> After reading your pull request, I am a little confused.
>> May I ask whether a config setting such as
>> osd_op_queue_mclock_client_op_res actually takes effect in the dmclock
>> queues you added and in their enqueue and dequeue methods?
>
> Yes, that (and related) configuration option is used. You'll see it
> referenced in both src/osd/mClockOpClassQueue.cc and
> src/osd/mClockClientQueue.cc.
>
> Let me answer for mClockOpClassQueue; the process is similar in
> mClockClientQueue.
>
> The configuration value is brought into an instance of
> mClockOpClassQueue::mclock_op_tags_t. The variable
> mClockOpClassQueue::mclock_op_tags holds a unique_ptr to a singleton of
> that type. Then, when a new operation is enqueued, the function
> mClockOpClassQueue::op_class_client_info_f is called to determine its
> mclock parameters, at which point the value is used.
>
>> The enqueue function below inserts a request into a map<priority,
>> subqueue>; I guess that for the mclock_opclass queue you set a high
>> priority for client ops and a lower one for scrub, recovery, etc.
>> Within each subqueue of the same priority, do you use FIFO?
>>
>> void enqueue_strict(K cl, unsigned priority, T item) override final {
>>   high_queue[priority].enqueue(cl, 0, item);
>> }
>
> Yes, higher-priority operations use a strict queue and lower-priority
> operations use mclock. That basic behavior was based on the two earlier
> op queue implementations (src/common/WeightedPriorityQueue.h and
> src/common/PrioritizedQueue.h). The priority value used as the cut-off
> is determined by the configuration option osd_op_queue_cut_off (which
> can be "low" or "high", mapping to CEPH_MSG_PRIO_LOW and
> CEPH_MSG_PRIO_HIGH as defined in src/include/msgr.h; see the function
> OSD::get_io_prio_cut).
>
> And those operations that end up in the high queue are handled strictly
> -- higher priorities before lower priorities.
>
>> I would appreciate any comments you can provide, especially if I have
>> misunderstood something.
>
> I hope that's helpful. Please let me know if you have further questions.
>
> Eric
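
To make the arithmetic behind the res/lim settings above concrete, here is a
minimal, self-contained sketch of the reservation/limit tag bookkeeping that
mclock-style schedulers perform. This is not the actual Ceph or dmclock code;
all names and the simplified tag formula are made up for illustration only.

  // Simplified model of mclock reservation/limit tags for one client.
  // Settings from the test above: reservation = 50 ops/s, limit = 100 ops/s.
  #include <algorithm>
  #include <cstdio>

  struct TagState {
    double reservation;  // guaranteed ops/sec
    double limit;        // capped ops/sec
    double r_tag;        // earliest time the next op is owed by reservation
    double l_tag;        // earliest time the next op may run under the limit
  };

  // Tag a newly arriving request at time 'now' (seconds). Each request
  // pushes the tags forward by 1/rate, so a limit of 100 ops/sec spaces
  // limit tags 10 ms apart.
  void assign_tags(TagState& c, double now) {
    c.r_tag = std::max(c.r_tag + 1.0 / c.reservation, now);
    c.l_tag = std::max(c.l_tag + 1.0 / c.limit, now);
  }

  int main() {
    TagState client{50.0, 100.0, 0.0, 0.0};
    for (int i = 0; i < 5; ++i) {
      assign_tags(client, 0.0);
      std::printf("op %d: reservation tag %.3f s, limit tag %.3f s\n",
                  i, client.r_tag, client.l_tag);
    }
    // A scheduler that never dispatches a request before its limit tag
    // would cap this client at roughly 100 ops/sec.
    return 0;
  }

With these settings the limit tags advance 10 ms apart, which is where the
expected ceiling of about 100 IOPS in the fio test comes from.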
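
Along the same lines, here is a simplified sketch of the cut-off routing Eric
describes. The priority constants below are placeholders rather than the real
values from src/include/msgr.h, and route() is only a stand-in for the logic
around OSD::get_io_prio_cut and the op queue's enqueue path.

  #include <cstdio>

  // Illustrative stand-ins for the messenger priority levels.
  constexpr unsigned PRIO_LOW  = 64;    // placeholder value
  constexpr unsigned PRIO_HIGH = 196;   // placeholder value

  enum class Queue { Strict, MClock };

  // Operations at or above the cut-off priority go to the strict queue;
  // everything else is handled by the mclock-managed queue.
  Queue route(unsigned op_priority, unsigned cut_off) {
    return op_priority >= cut_off ? Queue::Strict : Queue::MClock;
  }

  int main() {
    unsigned cut_off = PRIO_HIGH;   // osd_op_queue_cut_off = high
    std::printf("client op (prio %u): %s\n", PRIO_LOW,
                route(PRIO_LOW, cut_off) == Queue::Strict ? "strict" : "mclock");
    std::printf("urgent op (prio 255): %s\n",
                route(255, cut_off) == Queue::Strict ? "strict" : "mclock");
    return 0;
  }

With the cut-off set to "high", ordinary client I/O falls below the cut-off
and is therefore scheduled by mclock rather than the strict queue, which
matches the intent of the test setup above.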