Hi all,

We're running a 180-node cluster in Docker containers (the official ceph:hammer image). Recently we've found a rarely reproducible problem on it: sometimes data transfer freezes for a significant time (5-15 minutes). The issue occurs with both radosgw and librados applications (docker-distribution). It can be worked around by decreasing "ms tcp read timeout" to 2-3 seconds on the client side, but that does not seem like a good solution.

To reproduce the problem, I wrote a bash script that fetches every object (and its omap/xattrs) from the data pool with the 'rados' CLI utility in an infinite loop. Running it on 3 hosts simultaneously against docker-distribution's pool (4 MB objects) for 8 hours produced 25 reads that each took more than 60 seconds. The script results are here (hostnames substituted):
https://gist.github.com/aaaler/cb190c1eb636564519a5#file-distribution-pool-err-sorted

But there is nothing suspicious in the corresponding OSD logs. For example, take a look at one of these faulty reads:

21:44:32 consumed 1891 seconds reading blob:daa46e8d-170e-43ab-8c00-526782f95e02-0 on host1 (192.168.1.133)
osdmap e80485 pool 'distribution' (17) object 'blob:daa46e8d-170e-43ab-8c00-526782f95e02-0' -> pg 17.97f485f (17.5f) -> up ([139,149,167], p139) acting ([139,149,167], p139)

So we got 1891 seconds of waiting, after which the client simply proceeded without any errors. I tried to find something useful in the osd.139 log (https://gist.github.com/aaaler/cb190c1eb636564519a5#file-osd-139-log), but couldn't find anything interesting.

Another example (the next line in the script output) showed 2983 seconds spent reading blob:f5c22093-6e6d-41a6-be36-462330b36c67-71 from osd.56. Again, nothing in the osd.56 log during that time:
https://gist.github.com/aaaler/cb190c1eb636564519a5#file-osd-56-log

How can I troubleshoot this? Turning up logging across a 180-node cluster would generate a lot of traffic and make it hard to find the right host whose log to check :(

A few words about the underlying configuration:
- ceph:hammer containers in Docker 1.9.1 (--net=host)
- Gentoo with 3.14.18/3.18.10 kernels
- 1 Gbps LAN
- OSDs use a directory in /var
- the hosts share the OSD workload with some php-fpm instances

The configuration is pretty much default, except for a few OSD parameters tuned to reduce the scrubbing workload:

[osd]
osd disk thread ioprio class = idle
osd disk thread ioprio priority = 5
osd recovery max active = 1
osd max backfills = 2

--
Sincerely,
Alexey Griazin
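
P.S. A few concrete snippets, in case they help. The client-side workaround mentioned above is just this in the clients' ceph.conf:

[client]
ms tcp read timeout = 3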
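
The reproduction loop is essentially the following (a simplified sketch: the pool name is hardcoded, and the real script also dumps each object's omap and xattrs):

#!/bin/bash
# Read every object in the pool forever; report any read slower than 60 s.
POOL=distribution

while true; do
    rados -p "$POOL" ls | while read -r obj; do
        start=$(date +%s)
        rados -p "$POOL" get "$obj" /dev/null
        elapsed=$(( $(date +%s) - start ))
        if [ "$elapsed" -ge 60 ]; then
            echo "$(date +%H:%M:%S) consumed $elapsed seconds reading $obj on $(hostname)"
        fi
    done
done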
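
The pg mapping quoted above can be reproduced for any object with:

ceph osd map distribution blob:daa46e8d-170e-43ab-8c00-526782f95e02-0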
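
One idea I had was to raise debug levels only on the implicated OSD once a slow read is detected, rather than cluster-wide, e.g.:

ceph tell osd.139 injectargs '--debug-ms 1 --debug-osd 10'

but suggestions for a better approach are welcome.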