On Thu, Dec 11, 2014 at 7:57 PM, reistlin87 <79026480913@xxxxxxxxx> wrote:
> Hi all!
>
> We have an annoying problem: when we launch an intensive read workload
> against rbd, the client on which the image is mounted hangs in this
> state:
>
> Device:  rrqm/s wrqm/s   r/s   w/s  rMB/s  wMB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> sda        0.00   0.00  0.00  1.20   0.00   0.00     8.00     0.00   0.00    0.00    0.00   0.00   0.00
> dm-0       0.00   0.00  0.00  1.20   0.00   0.00     8.00     0.00   0.00    0.00    0.00   0.00   0.00
> dm-1       0.00   0.00  0.00  0.00   0.00   0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
> rbd0       0.00   0.00  0.00  0.00   0.00   0.00     0.00    32.00   0.00    0.00    0.00   0.00 100.00
>
> Only a reboot helps. The logs are clean.
>
> The fastest way to get a hang is to run an fio read with a 512K block
> size; 4K usually works fine. But the client may also hang without fio,
> simply under heavy load.
>
> We have used different versions of the Linux kernel and Ceph. The OSDs
> and MONs currently run Ceph 0.87-1 and Linux kernel 3.18. On the
> clients we have tried the latest builds from
> http://gitbuilder.ceph.com/, for example Ceph 0.87-68. Through libvirt
> everything works fine - we also use KVM and stgt (but stgt is slow).

Is there anything in dmesg around the time it hangs?

If possible, don't change anything about your config - number of osds,
number of pgs, pools, etc - so that you can reproduce with logging
enabled.

Thanks,

                Ilya
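
For reference, a minimal fio job along the lines described in the report
(sequential reads with a 512K block size against the mapped device) might
look like the sketch below; the device path /dev/rbd0, queue depth, and
runtime are illustrative assumptions, not values taken from the original
post:

    # sequential 512K reads against the mapped rbd device,
    # direct I/O to bypass the page cache
    fio --name=rbd-read --filename=/dev/rbd0 --rw=read --bs=512k \
        --direct=1 --ioengine=libaio --iodepth=32 \
        --runtime=120 --time_based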
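
To check for kernel-side evidence of the hang, one generic approach is to
look for hung-task warnings and to dump blocked tasks via sysrq; this is a
sketch rather than a procedure prescribed in the thread, and it assumes
sysrq is enabled (kernel.sysrq=1):

    # recent kernel messages with human-readable timestamps
    dmesg -T | tail -n 100

    # dump all tasks in uninterruptible (D) state to the kernel log,
    # then look for "blocked for more than ... seconds" traces
    echo w > /proc/sysrq-trigger
    dmesg | grep -B 2 -A 20 'blocked for more than'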
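
As for reproducing with logging enabled: debug output for the kernel rbd
client is typically turned on through dynamic debug, assuming the client
kernel was built with CONFIG_DYNAMIC_DEBUG; a sketch:

    # mount debugfs if it is not already mounted
    mount -t debugfs none /sys/kernel/debug

    # enable debug messages from the libceph and rbd modules;
    # the output goes to the kernel log (dmesg)
    echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
    echo 'module rbd +p' > /sys/kernel/debug/dynamic_debug/control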