Log reading/how do I tell what an OSD is trying to connect to

I'm having a problem with my cluster.  It's running 0.87 right now, but I saw the same behavior with 0.80.5 and 0.80.7.

The problem is that my logs are filling up with "replacing existing (lossy) channel" log lines (see below), to the point where I'm filling drives to 100% almost daily just with logs.

It doesn't appear to be network related, because it happens even when talking to other OSDs on the same host.  Nearly all of the log lines show port 0 on the remote end.  Is this an indicator that it's failing to resolve port numbers somehow, or is this normal at this point in connection setup?

The systems that are causing this problem are somewhat unusual; they're running OSDs in Docker containers, but they *should* be configured to run as root and have full access to the host's network stack.  They manage to work, mostly, but things are still really flaky.

Also, is there documentation on what the various fields in these log lines mean, short of digging through the source?  And how does Ceph resolve OSD numbers into host/port addresses?


2014-11-12 01:50:40.802604 7f7828db8700  0 -- 10.2.0.36:6819/1 >> 10.2.0.36:0/1 pipe(0x1ce31c80 sd=135 :6819 s=0 pgs=0 cs=0 l=1 c=0x1e070580).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.802708 7f7816538700  0 -- 10.2.0.36:6830/1 >> 10.2.0.36:0/1 pipe(0x1ff61080 sd=120 :6830 s=0 pgs=0 cs=0 l=1 c=0x1f3db2e0).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.803346 7f781ba8d700  0 -- 10.2.0.36:6819/1 >> 10.2.0.36:0/1 pipe(0x1ce31180 sd=125 :6819 s=0 pgs=0 cs=0 l=1 c=0x1e070420).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.803944 7f781996c700  0 -- 10.2.0.36:6830/1 >> 10.2.0.36:0/1 pipe(0x1ff618c0 sd=107 :6830 s=0 pgs=0 cs=0 l=1 c=0x1f3d8420).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.804185 7f7816538700  0 -- 10.2.0.36:6819/1 >> 10.2.0.36:0/1 pipe(0x1ffd1e40 sd=20 :6819 s=0 pgs=0 cs=0 l=1 c=0x1e070840).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.805235 7f7813407700  0 -- 10.2.0.36:6819/1 >> 10.2.0.36:0/1 pipe(0x1ffd1340 sd=60 :6819 s=0 pgs=0 cs=0 l=1 c=0x1b2d6260).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.806364 7f781bc8f700  0 -- 10.2.0.36:6819/1 >> 10.2.0.36:0/1 pipe(0x1ffd0b00 sd=162 :6819 s=0 pgs=0 cs=0 l=1 c=0x675c580).accept replacing existing (lossy) channel (new one lossy=1)

2014-11-12 01:50:40.806425 7f781aa7d700  0 -- 10.2.0.36:6830/1 >> 10.2.0.36:0/1 pipe(0x1db29600 sd=143 :6830 s=0 pgs=0 cs=0 l=1 c=0x1f3d9600).accept replacing existing (lossy) channel (new one lossy=1)
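
To get a quick picture of which local/remote pairs are producing these lines, something like the following might help. It is only a rough sketch: the regular expression is inferred from the excerpts above rather than from any documented format, the function names (parse_line, split_addr, summarize) are made up for illustration, and the key=value pairs inside pipe(...) are collected as-is without any claim about what they mean.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpts above (an assumption, not a documented format):
# <date> <time> <thread> <level> -- <local> >> <remote> pipe(<fields>).<message>
LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+) "
    r"-- (?P<local>\S+) >> (?P<remote>\S+) "
    r"pipe\((?P<pipe>[^)]*)\)\.(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of the pieces of one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Keep the key=value tokens from pipe(...) verbatim; tokens without '=' are skipped.
    fields = {}
    for token in rec.pop("pipe").split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    rec["pipe_fields"] = fields
    return rec


def split_addr(addr):
    """Split 'ip:port/suffix' into (ip, port, suffix); the suffix is kept as a string."""
    hostport, _, suffix = addr.partition("/")
    host, _, port = hostport.rpartition(":")
    return host, int(port), suffix


def summarize(lines):
    """Count matching log lines per (local, remote) address pair."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            counts[(rec["local"], rec["remote"])] += 1
    return counts
```

Feeding an open log file to summarize() and printing counts.most_common() would show whether a handful of address pairs account for most of the volume, and split_addr() makes it easy to filter for remote ends whose port is 0.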


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
