kernel:rbd:rbd0: encountered watch error: -10

Hi!

I have run into a confusing case:

When I write to cephfs and rbd at the same time, the rbd I/O hangs after a while, and I find this in the kernel log:

kernel:rbd:rbd0: encountered watch error: -10

I tried to reproduce it with the steps below and succeeded:

- run 2 dd processes writing to cephfs
- write files to the rbd device at the same time
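
For anyone who wants to try the same load pattern, the steps above could be sketched roughly like this in Python. This is only an approximation of what I ran: it uses threads instead of separate dd processes, and the names write_zeros and run_mixed_load as well as the mount points in the comment are made up for illustration.

```python
import os
import threading

BLOCK_SIZE = 1 << 20  # 1 MiB per write, comparable to dd bs=1M (assumed value)


def write_zeros(path, total_bytes, block_size=BLOCK_SIZE):
    """Write total_bytes of zeros to path, roughly like dd if=/dev/zero of=path."""
    block = bytes(block_size)
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data out of the page cache
    return written


def run_mixed_load(cephfs_dir, rbd_dir, total_bytes, cephfs_writers=2):
    """Run the cephfs writers and one rbd writer concurrently; return the file paths."""
    targets = [
        os.path.join(cephfs_dir, "dd%d.bin" % i) for i in range(cephfs_writers)
    ]
    targets.append(os.path.join(rbd_dir, "rbd-write.bin"))
    threads = [
        threading.Thread(target=write_zeros, args=(path, total_bytes))
        for path in targets
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return targets


# Hypothetical mount points, adjust to the real ones:
# run_mixed_load("/mnt/cephfs", "/mnt/rbd", 10 * (1 << 30))
```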

I see that many CPUs are in iowait, and many kernel processes are in D state.
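
A small sketch of how the D-state processes can be listed, by reading the state field from /proc/<pid>/stat. The helper names parse_stat and list_d_state are made up here, and the scan only looks at the top-level PID entries (kernel threads such as kswapd show up there with their own PID).

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so the line is split at the LAST ')'.
    """
    left, _, right = stat_line.rpartition(")")
    pid_str, _, comm = left.partition("(")
    return int(pid_str), comm, right.split()[0]


def list_d_state(proc_root="/proc"):
    """Return [(pid, comm), ...] for processes in uninterruptible sleep (D)."""
    result = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            result.append((pid, comm))
    return sorted(result)


# Example: for pid, comm in list_d_state(): print(pid, comm)
```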

My guesses:

- The processes in D state are mainly kswapd and the dirty-page writeback threads.
  When the I/O queue of the rbd disk gets very long, every process doing I/O on the rbd disk
  has to queue and wait a long time in D state; the kernel automatically prints the call stack once a task has been blocked for more than 120s.

- rbd hangs because the rbd client uses watch-notify to communicate, and heavy iowait pressure may interfere with it.

- cephfs and rbd share network bandwidth. We use 40Gb/s IB for ceph, so the network is much faster than the disks.

The only workaround I can think of is to refresh the page cache periodically from crond, but that may degrade performance.
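
To make the workaround concrete, here is a minimal sketch, assuming that "refresh the page cache" means syncing dirty pages and then dropping clean cache through /proc/sys/vm/drop_caches. The function name refresh_page_cache is made up, and it would have to run as root from cron.

```python
import os


def refresh_page_cache(drop_caches_path="/proc/sys/vm/drop_caches", mode="1"):
    """Flush dirty pages to disk, then ask the kernel to drop clean caches.

    mode "1" drops the page cache, "2" drops dentries and inodes, "3" drops both.
    """
    if mode not in ("1", "2", "3"):
        raise ValueError("mode must be '1', '2' or '3'")
    os.sync()  # dirty pages cannot be dropped, so write them back first
    with open(drop_caches_path, "w") as f:
        f.write(mode + "\n")


# Example (as root): refresh_page_cache()
```

Dropping the cache forces later reads to go back to the cluster, which is where the performance cost comes from.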

Could someone help me?

Why does rbd hang, and how can I fix it?

I really want to use cephfs and rbd at the same time, but this issue is a serious problem for a production environment.

Thanks
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
