On 7/20/19 6:06 PM, Wei Zhao wrote: > Hi ceph users: > I was doing write benchmark, and found some io will be blocked for a > very long time. The following log is one op , it seems to wait for > replica to finish. My ceph version is 12.2.4, and the pool is 3+2 EC . > Does anyone give me some adives about how I sould debug next ? > > { > "ops": [ > { > "description": "osd_op(client.17985.0:670679 39.18 > 39:1a63fc5c:::benchmark_data_SH-IDC1-10-5-37-174_2917453_object670678:head > [set-alloc-hint object_size 1048576 write_size 1048576,write > 0~1048576] snapc 0=[] ondisk+write+known_if_redirected e1135)", > "initiated_at": "2019-07-20 23:13:18.725466", > "age": 329.248875, > "duration": 329.248901, > "type_data": { > "flag_point": "waiting for sub ops", > "client_info": { > "client": "client.17985", > "client_addr": "10.5.137.174:0/1544466091", > "tid": 670679 > }, > "events": [ > { > "time": "2019-07-20 23:13:18.725466", > "event": "initiated" > }, > { > "time": "2019-07-20 23:13:18.726585", > "event": "queued_for_pg" > }, > { > "time": "2019-07-20 23:13:18.726606", > "event": "reached_pg" > }, > { > "time": "2019-07-20 23:13:18.726752", > "event": "started" > }, > { > "time": "2019-07-20 23:13:18.726842", > "event": "waiting for subops from 4" > }, This usually indicates there is something going on with osd.4 I would go and see if osd.4 is very busy at that moment and check if the disk is 100% busy in iostat. It could be a number of things, but I would check that first. Wido > { > "time": "2019-07-20 23:13:18.743134", > "event": "op_commit" > }, > { > "time": "2019-07-20 23:13:18.743137", > "event": "op_applied" > } > ] > } > }, > _______________________________________________ > ceph-users mailing list > ceph-users@xxxxxxxxxxxxxx > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > _______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com