As no response was given, I will explain what I found; maybe it can help other people.

The .dirXXXXXXX object is an index marker with a data size of 0. The metadata associated with this object (located in the LevelDB of the OSDs currently holding the marker) is the index of the bucket corresponding to it. My problem came from the number of objects stored in this bucket: more than 50 million. As each entry in the index takes between 200 and 250 bytes, the index was around 12 GB. That is why it is recommended to add one shard to the index for every 100,000 objects.

During a Ceph rebuild, some PGs move from some OSDs to others. While an index is moving, all write requests to the bucket are blocked until the operation completes. During this move, the user had launched an upload batch on the bucket, so a lot of requests were blocked, which ended up blocking all requests on the primary PGs held by the OSD. So the loop I saw was in fact just normal, but moving a 12 GB object from one SATA disk to another takes several minutes, too long for a Ceph cluster with a lot of clients to survive.

The lesson of this story is: don't forget to shard your bucket!!!

---------------------------------------------------------------------------------------------------------------

Yesterday we encountered this bug. One OSD was looping on:

2018-01-03 16:20:59.148121 7f011a6a1700 0 log_channel(cluster) log [WRN] : slow request 30.254269 seconds old, received at 2018-01-03 16:20:28.883837: osd_op(client.48285929.0:14601958 35.8abfc02e .dir.0a3e5369-ff79-4f7d-b0b6-79c5a75b1759.29113876.1 [call rgw.bucket_prepare_op] snapc 0=[] ondisk+write+known_if_redirected e359833) currently waiting for degraded object
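The .dir object in the log excerpt above is exactly such an index marker. As a side note, the sizing estimate from the explanation can be checked with a quick back-of-envelope calculation (a sketch only; the 50 million objects, ~250 bytes per entry, and 100,000 objects per shard are the figures quoted above):

```shell
objects=50000000          # objects in the bucket (figure from this incident)
bytes_per_entry=250       # upper bound on one index entry (figure from this incident)
per_shard=100000          # recommended objects per index shard

index_bytes=$(( objects * bytes_per_entry ))
echo "index size: $(( index_bytes / 1000 / 1000 / 1000 )) GB"
echo "recommended shards: $(( objects / per_shard ))"
```

That gives roughly 12 GB for the index and about 500 shards. Depending on the release, such a bucket can be resharded offline with radosgw-admin bucket reshard --bucket=... --num-shards=... (the bucket name and shard count being whatever fits your cluster).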
The requests on this OSD.150 quickly became blocked:

2018-01-03 16:25:56.241064 7f011a6a1700 0 log_channel(cluster) log [WRN] : 20 slow requests, 1 included below; oldest blocked for > 327.357139 secs
2018-01-03 16:30:19.299288 7f011a6a1700 0 log_channel(cluster) log [WRN] : 45 slow requests, 1 included below; oldest blocked for > 590.415387 secs
...
2018-01-03 16:46:04.900204 7f011a6a1700 0 log_channel(cluster) log [WRN] : 100 slow requests, 2 included below; oldest blocked for > 1204.060056 secs

while it was still looping:

2018-01-03 16:46:04.900220 7f011a6a1700 0 log_channel(cluster) log [WRN] : slow request 123.294762 seconds old, received at 2018-01-03 16:44:01.605320 : osd_op(client.48285929.0:14605228 35.8abfc02e .dir.0a3e5369-ff79-4f7d-b0b6-79c5a75b1759.29113876.1 [call rgw.bucket_complete_op] snapc 0=[] ack+ondisk+write+known_if_redirected e359833) currently waiting for degraded object

All these requests were blocked on OSD.150, and a lot of VMs attached to Ceph were hanging. The degraded object was .dir.0a3e5369-ff79-4f7d-b0b6-79c5a75b1759.29113876.1 in PG 35.2e. This PG was located on 4 OSDs, and the object had a size of 0 on all 4 of them. A "ceph pg 35.2e query" never returned a response. Killing OSD.150 only moved the problem: the requests became blocked on the new primary.

I found the relatively new bug #22072, which looks like mine, but there was no response from the Ceph team. I finally tried the same solution, "rados rm -p pool/degraded_object", but the command did not return either, and I stopped it after 15 minutes. A few minutes later, the 4 OSDs holding PG 35.2e suddenly restarted and the problem was solved: the object had been deleted on all 4 OSDs. In any case it led to a production outage, I have no idea what produced the "degraded object", and I am not sure whether the fix came from my command or from some internal process.
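For anyone hitting the same loop, the investigation above boils down to a few commands (a sketch only: these need a live cluster, the PG and OSD IDs are the ones from this incident, and <pool> stands in for the pool name, which the original message did not spell out):

```shell
# Map the PG holding the degraded object to the OSDs that serve it
ceph pg map 35.2e

# Query the PG state (in our case this command never returned)
ceph pg 35.2e query

# Inspect what the primary is blocked on (run on the node hosting osd.150)
ceph daemon osd.150 dump_blocked_ops

# Last resort, as tried above: remove the degraded index marker directly
rados -p <pool> rm .dir.0a3e5369-ff79-4f7d-b0b6-79c5a75b1759.29113876.1
```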
At this time we are still trying to repair the filesystems of some of the VMs attached to Ceph, and I have to explain that this whole production outage came from one empty object... The real problem is: why was Ceph unable to handle this "degraded object", looping on it instead and blocking all the requests on OSD.150?

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com