Yes, they can hold up reads to the same object. Depending on where
they're stuck, they may also block other requests, for example if
they're taking up all the filestore threads. Waiting for subops means
they're waiting for replicas to acknowledge the write and commit it to
disk, so the real cause of the slowness of those ops is the replicas.
If you enable 'debug osd = 25', 'debug filestore = 25', and
'debug journal = 20' you can trace through the logs to see exactly
what's happening with the subops for those requests.
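
For reference, a minimal sketch of how those debug settings might look
in ceph.conf. Placing them in the [osd] section and restarting the OSDs
afterwards is an assumption here, not something stated above:

```ini
; Sketch only: section placement is assumed, adjust to your own layout.
[osd]
    debug osd = 25
    debug filestore = 25
    debug journal = 20
```

These levels are very verbose, so the OSD logs will grow quickly; it is
probably worth lowering them again once the slow subops have been
traced.
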
It looks like I hit exactly the same issue as described in "Slow request
warnings on 0.48", but from a different angle. When our client ran MySQL
updates, performance started to degrade across the cluster, bringing the
rest of the VMs to a standstill and producing incredible latency. At
some point slow request warnings started to pop up, and now it seems I
cannot get rid of them at all: I have shut down all clients and all Ceph
subsystems and restarted everything, and it is back to the same
behaviour, with slow request warnings.
Before rebuilding the OSDs I will enable debug logging as you suggested,
in an attempt to find the underlying issue. I will then rebuild the OSDs
as a last resort, to confirm whether the OSDs are indeed causing the
issue.