msgr bug in master caused by recent protocol refactor (?)

In CephFS testing, we've observed transient failures caused by what
appear to be dropped messages [1,2]. These seem to have been
introduced by the recent refactor PRs [3,4], but I have no evidence
other than the problems appearing during testing with [4] after [4]
was merged.

I'm running tests [5] to collect more debug output (debug ms = 20),
but I wanted to canvass for ideas/advice before I get much deeper.
Has anyone else seen transient failures with messages getting dropped?

[1] http://tracker.ceph.com/issues/36389
[2] http://tracker.ceph.com/issues/36349
[3] https://github.com/ceph/ceph/pull/23415
[4] https://github.com/ceph/ceph/pull/24305
[5] http://pulpito.ceph.com/?branch=wip-pdonnell-testing-20181011.152759

-- 
Patrick Donnelly


