Hi Eric,

We are now testing the mclock priority queue that you contributed several
days ago. Our test environment consists of four machines: one machine for
the monitor and mgr daemons, and two OSDs on each of the remaining three
machines. The mclock-related configuration is as follows:

    osd_op_queue = mclock_opclass
    osd_op_queue_mclock_client_op_res = 20000.0
    osd_op_queue_mclock_client_op_wgt = 0.0
    osd_op_queue_mclock_client_op_lim = 30000.0
    osd_op_queue_mclock_recov_res = 0.0
    osd_op_queue_mclock_recov_wgt = 0.0
    osd_op_queue_mclock_recov_lim = 2000.0

When we killed one OSD daemon to test the effect of recovery on client
ops, the other OSDs crashed as well because of an assertion failure:

    ceph version 12.0.3-2318-g32ab095 (32ab09536207b4b261874c0063b3275b97537045) luminous (dev)
     1: (()+0x9e86b1) [0x7f4287e846b1]
     2: (()+0xf100) [0x7f4284d18100]
     3: (gsignal()+0x37) [0x7f4283d415f7]
     4: (abort()+0x148) [0x7f4283d42ce8]
     5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7f4287ec2364]
     6: (ceph::mClockQueue<std::pair<spg_t, PGQueueable>, ceph::mClockOpClassQueue::osd_op_type_t>::dequeue()+0x45f) [0x7f4287baac3f]
     7: (ceph::mClockOpClassQueue::dequeue()+0xd) [0x7f4287baacfd]
     8: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x314) [0x7f428798a174]
     9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x8e9) [0x7f4287ec7d39]
    10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f4287ec9ec0]
    11: (()+0x7dc5) [0x7f4284d10dc5]
    12: (clone()+0x6d) [0x7f4283e02ced]
    NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

However, when osd_op_queue was set to wpq or prio, everything worked well.
Do you know why the assertion was triggered? Thanks very much.

2017-06-30 2:03 GMT+08:00 J. Eric Ivancich <ivancich@xxxxxxxxxx>:
> On 06/28/2017 02:55 PM, sheng qiu wrote:
>> May I ask whether you have any plan to implement the client-side logic
>> for a true "d"mclock? It seems to me that right now it is mclock on each
>> individual OSD, and each client also has a common IOPS config. We are
>> planning to work on that part and integrate it with your current work.
>
> That is not a high priority in the short term. Our main goal with
> integrating dmclock/mclock was to better manage priorities among
> operation classes.
>
> Developers at SK Telecom have done some work towards this, though. For
> example, see here:
>
> https://github.com/ivancich/ceph/pull/1
>
> Eric