Re: IO Hang on rbd


 



Try lowering "filestore max sync interval" and "filestore min sync
interval". It looks like data is being flushed from some overly large
buffer during the hung period.
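For example, something like this in ceph.conf (these are just the defaults for that release, as a starting point to experiment from, not tuned recommendations):

```ini
[osd]
        # back toward the defaults (5 / 0.01) from the current 100 / 10
        filestore max sync interval = 5
        filestore min sync interval = 0.01
```

The same values can be applied to running OSDs without a restart, e.g. with `ceph tell osd.* injectargs '--filestore_max_sync_interval 5 --filestore_min_sync_interval 0.01'`.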

If this does not help, you can monitor perf stats on the OSDs to see if
some queue is unusually large.
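To spot such a queue, a small script can scan the JSON printed by `ceph daemon osd.N perf dump` for queue counters above a threshold. A sketch (the sample data, counter names, and thresholds below are illustrative assumptions, not values from your cluster):

```python
import json

# Illustrative stand-in for `ceph daemon osd.N perf dump` output;
# on a real cluster, load the JSON that command prints instead.
SAMPLE = """
{
  "filestore": {
    "journal_queue_ops": 1980,
    "journal_queue_bytes": 520000000,
    "op_queue_ops": 12,
    "op_queue_bytes": 4096
  }
}
"""

def find_large_queues(perf, ops_limit=1000, bytes_limit=256 * 1024 * 1024):
    """Return (section, counter, value) for queue counters above the limits."""
    hits = []
    for section, counters in perf.items():
        if not isinstance(counters, dict):
            continue
        for name, value in counters.items():
            if "queue" not in name or not isinstance(value, (int, float)):
                continue
            if name.endswith("_ops") and value > ops_limit:
                hits.append((section, name, value))
            elif name.endswith("_bytes") and value > bytes_limit:
                hits.append((section, name, value))
    return hits

if __name__ == "__main__":
    for section, name, value in find_large_queues(json.loads(SAMPLE)):
        print("%s/%s = %s" % (section, name, value))
```

Running it repeatedly during one of the hangs should show which queue (journal, op queue, ...) is backing up, which in turn points at the buffer being flushed.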

-- 
Tomasz Kuzemko
tomasz.kuzemko@xxxxxxx

On Thu, Dec 11, 2014 at 07:57:48PM +0300, reistlin87 wrote:
> Hi all!
> 
> We have an annoying problem - when we launch an intensive read workload over rbd, the client to which the image is mapped hangs in this state:
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sda               0.00     0.00    0.00    1.20     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-0              0.00     0.00    0.00    1.20     0.00     0.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> rbd0              0.00     0.00    0.00    0.00     0.00     0.00     0.00    32.00    0.00    0.00    0.00   0.00 100.00
> 
> Only a reboot helps. The logs are clean.
> 
> The fastest way to trigger the hang is to run a fio read with a 512K block size; 4K usually works fine. But the client may hang even without fio, just from heavy load.
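> For reference, a fio job along these lines triggers it (the device path, queue depth, and runtime here are illustrative examples, not our exact job file):
> 
> ```ini
> [rbd-read-hang]
> filename=/dev/rbd0
> ioengine=libaio
> direct=1
> rw=read
> bs=512k
> iodepth=32
> runtime=60
> ```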
> 
> We have used different versions of the Linux kernel and Ceph - currently the OSDs and MONs run Ceph 0.87-1 on Linux kernel 3.18. On the clients we have tried the latest builds from http://gitbuilder.ceph.com/, for example Ceph 0.87-68. Through libvirt everything works fine - we also use KVM and stgt (but stgt is slow).
> 
> Here is my config:
> [global]
>         fsid = 566d9cab-793e-47e0-a0cd-e5da09f8037a
>         mon_initial_members = srt-mon-001-000002,amz-mon-001-000601,db24-mon-001-000105
>         mon_host = 10.201.20.31,10.203.20.56,10.202.20.58
>         auth_cluster_required = cephx
>         auth_service_required = cephx
>         auth_client_required = cephx
>         filestore_xattr_use_omap = true
>         public network = 10.201.20.0/22
>         cluster network = 10.212.36.0/22
>         osd crush update on start = false
> [mon]
>         debug mon = 0
>         debug paxos = 0/0
>         debug auth = 0
> 
> [mon.srt-mon-001-000002]
>         host = srt-mon-001-000002
>         mon addr = 10.201.20.31:6789
> [mon.db24-mon-001-000105]
>         host = db24-mon-001-000105
>         mon addr = 10.202.20.58:6789
> [mon.amz-mon-001-000601]
>         host = amz-mon-001-000601
>         mon addr = 10.203.20.56:6789
> [osd]
>         osd crush update on start = false
>         osd mount options xfs = "rw,noatime,inode64,allocsize=4M"
>         osd mkfs type = xfs
>         osd mkfs options xfs = "-f -i size=2048"
>         osd op threads = 20
>         osd disk threads = 8
>         journal block align = true
>         journal dio = true
>         journal aio = true
>         osd recovery max active = 1
>         filestore max sync interval = 100
>         filestore min sync interval = 10
>         filestore queue max ops = 2000
>         filestore queue max bytes = 536870912
>         filestore queue committing max ops = 2000
>         filestore queue committing max bytes = 536870912
>         osd max backfills = 1
>         osd client op priority = 63
> [osd.5]
>         host = srt-osd-001-050204
> [osd.6]
>         host = srt-osd-001-050204
> [osd.7]
>         host = srt-osd-001-050204
> [osd.8]
>         host = srt-osd-001-050204
> [osd.109]
> ....
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


