Thanks, Eric! Yes, we found that "allow_limit_break" logic and did a
little hack so the queue still works when it's set to false. It seems
to be working as we expected.

May I ask whether you have any plan to implement the client-side logic
for a true "d"mClock? It seems to me that right now it's mClock on each
individual OSD, and every client also shares a common IOPS config. We
are planning to work on that part and integrate it with your current
work.

Thanks,
Sheng

On Wed, Jun 28, 2017 at 11:33 AM, J. Eric Ivancich <ivancich@xxxxxxxxxx> wrote:
> On 06/27/2017 05:21 PM, sheng qiu wrote:
>> I appreciate your kind reply.
>>
>> In our test, we set the following in ceph.conf:
>>
>> osd_op_queue = mclock_client
>> osd_op_queue_cut_off = high
>> osd_op_queue_mclock_client_op_lim = 100.0
>> osd_op_queue_mclock_client_op_res = 50.0
>> osd_op_num_shards = 1
>> osd_op_num_threads_per_shard = 1
>>
>> In this setup, all IO requests should go to one mclock_client queue
>> and use mClock scheduling (osd_op_queue_cut_off = high). We use fio
>> for the test, with job=1, bs=4k, qd=1 or 16.
>>
>> We expect the IOPS reported by fio to be < 100, while we see a much
>> higher value. Did we understand your work correctly, or did we miss
>> anything?
>
> Hi Sheng,
>
> I think you understand things well, but there is one additional detail
> you may not have noticed yet: what should be done when all clients
> have momentarily reached their limit and the ObjectStore would like
> another op to keep itself busy? We either a) refuse to provide it with
> an op, or b) give it the op that's most appropriate by weight. The
> ceph code currently is not designed to handle a), and it's not even
> clear that we should starve the ObjectStore in that manner. So we do
> b), and that means we can exceed the limit.
>
> dmclock's PullPriorityQueue constructors have a parameter
> _allow_limit_break, which ceph sets to true. That is how we do b)
> above. If you ever wanted to set it to false, you'd need to make other
> changes to the ObjectStore ceph code to handle cases where the op
> queue is not empty but is not ready/willing to return an op when one
> is requested.
>
> Eric
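
A minimal sketch of the choice Eric describes, written as a standalone
toy in C++. The ClientState struct and next_client function are
illustrative only and are not the dmclock API; they just show what
_allow_limit_break decides when every client is momentarily at its
limit and the ObjectStore still asks for an op:

#include <map>
#include <optional>
#include <string>

struct ClientState {
  double weight;        // mClock weight (proportional share)
  double limit_iops;    // configured limit, e.g. op_lim in the thread
  double current_iops;  // rate observed in the current window
  bool over_limit() const { return current_iops >= limit_iops; }
};

// Decide which client's op to dispatch next, or return nullopt when the
// queue declines to hand out an op (option "a" in Eric's reply).
std::optional<std::string>
next_client(const std::map<std::string, ClientState>& clients,
            bool allow_limit_break) {
  const std::string* best = nullptr;
  double best_weight = -1.0;

  // Prefer any client still under its limit, highest weight first.
  for (const auto& [name, c] : clients) {
    if (!c.over_limit() && c.weight > best_weight) {
      best = &name;
      best_weight = c.weight;
    }
  }
  if (best) return *best;

  // Every client is momentarily at its limit. With allow_limit_break ==
  // true (what Ceph passes today), we still return the highest-weight
  // client, so fio can observe IOPS above the configured limit. With
  // false, nothing is returned, which the current ObjectStore path is
  // not prepared to handle.
  if (!allow_limit_break) return std::nullopt;

  for (const auto& [name, c] : clients) {
    if (c.weight > best_weight) {
      best = &name;
      best_weight = c.weight;
    }
  }
  return best ? std::optional<std::string>(*best) : std::nullopt;
}

With allow_limit_break == false, a caller has to cope with a non-empty
queue that still returns nothing, which is the extra ObjectStore-side
work Eric mentions.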