OSD 23 notes that object rbd_data.21aafa6b8b4567.0000000000000aaa is
waiting for a scrub. What happens if you run "rados -p <rbd pool> rm
rbd_data.21aafa6b8b4567.0000000000000aaa" (capturing the OSD 23 logs
during this)? If that succeeds while your VM remains blocked on that
remove op, it looks like there is some problem in the OSD where ops
queued on a scrub are not properly awoken when the scrub completes.

On Wed, May 17, 2017 at 10:57 AM, Stefan Priebe - Profihost AG
<s.priebe@xxxxxxxxxxxx> wrote:

Hello Jason,
after enabling the log and generating a gcore dump, the request was
successful ;-(
So the log only contains the successful request; that was all I was
able to catch. I can send you the log on request.
Luckily I had another VM on another cluster behaving the same.
This time osd.23:
# ceph --admin-daemon /var/run/ceph/ceph-client.admin.22969.140085040783360.asok objecter_requests
{
    "ops": [
        {
            "tid": 18777,
            "pg": "2.cebed0aa",
            "osd": 23,
            "object_id": "rbd_data.21aafa6b8b4567.0000000000000aaa",
            "object_locator": "@2",
            "target_object_id": "rbd_data.21aafa6b8b4567.0000000000000aaa",
            "target_object_locator": "@2",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "last_sent": "1.83513e+06s",
            "attempts": 1,
            "snapid": "head",
            "snap_context": "28a43=[]",
            "mtime": "2017-05-17 16:51:06.0.455475s",
            "osd_ops": [
                "delete"
            ]
        }
    ],
    "linger_ops": [
        {
            "linger_id": 1,
            "pg": "2.f0709c34",
            "osd": 23,
            "object_id": "rbd_header.21aafa6b8b4567",
            "object_locator": "@2",
            "target_object_id": "rbd_header.21aafa6b8b4567",
            "target_object_locator": "@2",
            "paused": 0,
            "used_replica": 0,
            "precalc_pgid": 0,
            "snapid": "head",
            "registered": "1"
        }
    ],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}
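A dump like this can be filtered with a few lines of Python to show which in-flight ops are aimed at a particular OSD. This is only a rough sketch: the function name ops_on_osd and the trimmed SAMPLE are illustrative, and it assumes the objecter_requests output has been captured as JSON with the field names shown in this dump.

```python
import json


def ops_on_osd(dump, osd_id):
    """Return regular and linger ops from the dump that target osd_id."""
    found = []
    for kind in ("ops", "linger_ops"):
        for op in dump.get(kind, []):
            if op.get("osd") == osd_id:
                found.append({
                    "kind": kind,
                    # regular ops carry "tid", linger ops carry "linger_id"
                    "id": op.get("tid", op.get("linger_id")),
                    "object_id": op.get("object_id"),
                    "osd_ops": op.get("osd_ops", []),
                })
    return found


# Trimmed copy of the dump, keeping only the fields used here.
SAMPLE = json.loads("""
{
    "ops": [
        {"tid": 18777, "osd": 23,
         "object_id": "rbd_data.21aafa6b8b4567.0000000000000aaa",
         "osd_ops": ["delete"]}
    ],
    "linger_ops": [
        {"linger_id": 1, "osd": 23,
         "object_id": "rbd_header.21aafa6b8b4567"}
    ],
    "pool_ops": []
}
""")

if __name__ == "__main__":
    for op in ops_on_osd(SAMPLE, 23):
        print(op["kind"], op["id"], op["object_id"], op["osd_ops"])
```

In practice the dump would be loaded from the saved command output instead of the inline SAMPLE.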
OSD Logfile of OSD 23 attached.
Greets,
Stefan
On 17.05.2017 at 16:26, Jason Dillaman wrote:
On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
<s.priebe@xxxxxxxxxxxx> wrote:
You mean the request no matter if it is successful or not? Which log
level should be set to 20?
I'm hoping you can re-create the hung remove op when OSD logging is
increased -- "debug osd = 20" would be nice if you can turn it up that
high while attempting to capture the blocked op.
-- Jason
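One possible way to raise the OSD log level at runtime, without editing ceph.conf and restarting, is "ceph tell osd.<id> injectargs". The sketch below only builds that command and runs it on request. The helper name set_debug_osd is hypothetical, and it assumes the ceph CLI is on the PATH of a host with admin access to the cluster.

```python
import subprocess


def set_debug_osd(osd_id, level=20, dry_run=True):
    """Build (and optionally run) a command that sets 'debug osd' on one OSD."""
    if level < 0:
        raise ValueError("debug level must not be negative")
    cmd = ["ceph", "tell", f"osd.{osd_id}", "injectargs",
           f"--debug-osd {level}"]
    if not dry_run:
        # Needs a working ceph CLI and admin keyring on this host.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: only print the command that would be executed for osd.23.
    print(" ".join(set_debug_osd(23)))
```

Calling the helper again with the previous level afterwards restores normal logging, which matters because level 20 logs grow quickly.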