Does your ceph status show pg 2.cebed0aa (still) scrubbing? Sure -- I can
quickly scan the new log if you send it to me directly.

On Wed, May 17, 2017 at 2:18 PM, Stefan Priebe - Profihost AG
<s.priebe@xxxxxxxxxxxx> wrote:
> I can send the OSD log, if you want?
>
> Stefan
>
> On 17.05.2017 at 20:13, Stefan Priebe - Profihost AG wrote:
>> Hello Jason,
>>
>> the command
>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0000000000000aaa
>>
>> hangs as well. It does absolutely nothing and waits forever.
>>
>> Greets,
>> Stefan
>>
>> On 17.05.2017 at 17:05, Jason Dillaman wrote:
>>> OSD 23 notes that object rbd_data.21aafa6b8b4567.0000000000000aaa is
>>> waiting for a scrub. What happens if you run "rados -p <rbd pool> rm
>>> rbd_data.21aafa6b8b4567.0000000000000aaa" (capturing the OSD 23 logs
>>> during this)? If that succeeds while your VM remains blocked on that
>>> remove op, it looks like there is some problem in the OSD where ops
>>> queued on a scrub are not properly awoken when the scrub completes.
>>>
>>> On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>> Hello Jason,
>>>>
>>>> after enabling the log and generating a gcore dump, the request was
>>>> successful ;-(
>>>>
>>>> So the log only contains the successful request; that was all I was
>>>> able to catch. I can send you the log on request.
>>>>
>>>> Luckily I had another VM on another cluster behaving the same way.
>>>>
>>>> This time it is osd.23:
>>>> # ceph --admin-daemon /var/run/ceph/ceph-client.admin.22969.140085040783360.asok objecter_requests
>>>> {
>>>>     "ops": [
>>>>         {
>>>>             "tid": 18777,
>>>>             "pg": "2.cebed0aa",
>>>>             "osd": 23,
>>>>             "object_id": "rbd_data.21aafa6b8b4567.0000000000000aaa",
>>>>             "object_locator": "@2",
>>>>             "target_object_id": "rbd_data.21aafa6b8b4567.0000000000000aaa",
>>>>             "target_object_locator": "@2",
>>>>             "paused": 0,
>>>>             "used_replica": 0,
>>>>             "precalc_pgid": 0,
>>>>             "last_sent": "1.83513e+06s",
>>>>             "attempts": 1,
>>>>             "snapid": "head",
>>>>             "snap_context": "28a43=[]",
>>>>             "mtime": "2017-05-17 16:51:06.0.455475s",
>>>>             "osd_ops": [
>>>>                 "delete"
>>>>             ]
>>>>         }
>>>>     ],
>>>>     "linger_ops": [
>>>>         {
>>>>             "linger_id": 1,
>>>>             "pg": "2.f0709c34",
>>>>             "osd": 23,
>>>>             "object_id": "rbd_header.21aafa6b8b4567",
>>>>             "object_locator": "@2",
>>>>             "target_object_id": "rbd_header.21aafa6b8b4567",
>>>>             "target_object_locator": "@2",
>>>>             "paused": 0,
>>>>             "used_replica": 0,
>>>>             "precalc_pgid": 0,
>>>>             "snapid": "head",
>>>>             "registered": "1"
>>>>         }
>>>>     ],
>>>>     "pool_ops": [],
>>>>     "pool_stat_ops": [],
>>>>     "statfs_ops": [],
>>>>     "command_ops": []
>>>> }
>>>>
>>>> The log file of OSD 23 is attached.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>> On 17.05.2017 at 16:26, Jason Dillaman wrote:
>>>>> On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
>>>>> <s.priebe@xxxxxxxxxxxx> wrote:
>>>>>> You mean the request, no matter whether it is successful or not?
>>>>>> Which log level should be set to 20?
>>>>>
>>>>> I'm hoping you can re-create the hung remove op when OSD logging is
>>>>> increased -- "debug osd = 20" would be nice if you can turn it up that
>>>>> high while attempting to capture the blocked op.
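For reference, a rough sketch of the two checks discussed above, assuming a
ceph CLI of that era and reusing the PG and OSD ids from this thread:

To see whether pg 2.cebed0aa is still reported as scrubbing:
# ceph pg dump pgs_brief | grep 2.cebed0aa
# ceph pg 2.cebed0aa query | grep -i scrub

To raise logging on osd.23 at runtime before reproducing the blocked remove
op, and to drop it back to the default (commonly 1/5) afterwards:
# ceph tell osd.23 injectargs '--debug-osd 20/20'
# ceph tell osd.23 injectargs '--debug-osd 1/5'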
--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com