Re: Log reading/how do I tell what an OSD is trying to connect to

On Tue, Nov 11, 2014 at 6:28 PM, Scott Laird <scott@xxxxxxxxxxx> wrote:
> I'm having a problem with my cluster.  It's running 0.87 right now, but I
> saw the same behavior with 0.80.5 and 0.80.7.
>
> The problem is that my logs are filling up with "replacing existing (lossy)
> channel" log lines (see below), to the point where I'm filling drives to
> 100% almost daily just with logs.
>
> It doesn't appear to be network related, because it happens even when
> talking to other OSDs on the same host.

Well, that means it's probably not physical network related, but there
can still be plenty wrong with the networking stack... ;)

> The logs pretty much all point to
> port 0 on the remote end.  Is this an indicator that it's failing to resolve
> port numbers somehow, or is this normal at this point in connection setup?

That's definitely unusual, but I'd need to see a little more to be
sure if it's bad. My guess is that these pipes are connections from
the other OSD's Objecter, which is treated as a regular client and
doesn't bind to a socket for incoming connections.
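One way to check how widespread the port-0 peers are is to tally the replacement messages per (local, peer) pair. The following is only a rough sketch: the regular expressions are inferred from the log lines quoted in this thread, not from any documented format, they assume IPv4 addresses, and the names `parse_addr` and `tally_replacements` are made up for illustration. Each wrapped log entry has to be joined back into one line first.

```python
import re
from collections import Counter

# Assumed shape, inferred from the quoted log lines:
#   ... -- <local> >> <peer> pipe(...).accept replacing existing (lossy) channel ...
LINE_RE = re.compile(r"-- (?P<local>\S+) >> (?P<peer>\S+) pipe\(")

# Addresses in these lines look like ip:port/nonce (IPv4 only in this sketch).
ADDR_RE = re.compile(r"^(?P<ip>[^:]+):(?P<port>\d+)/(?P<nonce>\d+)$")


def parse_addr(text):
    """Split 'ip:port/nonce' into (ip, port, nonce), or return None."""
    m = ADDR_RE.match(text)
    if m is None:
        return None
    return m.group("ip"), int(m.group("port")), int(m.group("nonce"))


def tally_replacements(lines):
    """Count 'replacing existing (lossy) channel' messages per (local, peer)."""
    counts = Counter()
    for line in lines:
        if "replacing existing (lossy) channel" not in line:
            continue
        m = LINE_RE.search(line)
        if m is None:
            continue
        local = parse_addr(m.group("local"))
        peer = parse_addr(m.group("peer"))
        if local is None or peer is None:
            continue
        counts[(local, peer)] += 1
    return counts


# Three entries from the original post, each joined back into a single line.
SAMPLE = [
    "2014-11-12 01:50:40.802604 7f7828db8700  0 -- 10.2.0.36:6819/1 >> "
    "10.2.0.36:0/1 pipe(0x1ce31c80 sd=135 :6819 s=0 pgs=0 cs=0 l=1 "
    "c=0x1e070580).accept replacing existing (lossy) channel (new one lossy=1)",
    "2014-11-12 01:50:40.802708 7f7816538700  0 -- 10.2.0.36:6830/1 >> "
    "10.2.0.36:0/1 pipe(0x1ff61080 sd=120 :6830 s=0 pgs=0 cs=0 l=1 "
    "c=0x1f3db2e0).accept replacing existing (lossy) channel (new one lossy=1)",
    "2014-11-12 01:50:40.803346 7f781ba8d700  0 -- 10.2.0.36:6819/1 >> "
    "10.2.0.36:0/1 pipe(0x1ce31180 sd=125 :6819 s=0 pgs=0 cs=0 l=1 "
    "c=0x1e070420).accept replacing existing (lossy) channel (new one lossy=1)",
]

if __name__ == "__main__":
    for (local, peer), n in tally_replacements(SAMPLE).most_common():
        note = " (peer port 0: likely not a listening endpoint)" if peer[1] == 0 else ""
        print(f"{local[0]}:{local[1]} <- {peer[0]}:{peer[1]}  x{n}{note}")
```

If nearly every entry has a peer port of 0, that would fit the guess that these are client-style connections rather than OSD-to-OSD listener addresses, but it would not confirm it on its own.
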

The repetitive channel replacements are concerning, though. They can
be harmless in some circumstances, but this looks more like the
connection is simply failing to establish, so it's retrying over
and over again. Can you restart the OSDs with "debug ms = 10" in their
config file and post the logs somewhere? (There isn't really any
documentation on what the log fields mean, but the more detailed
output might also be easier for you to follow.)
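
For reference, a minimal sketch of that setting, assuming it goes in the [osd] section of ceph.conf so that it applies to every OSD reading that file (the section placement is an assumption here, not something stated above):

```ini
[osd]
    ; raise messenger debug logging; expect much larger log files
    debug ms = 10
```

Since the logs are already filling the drives, it is probably worth leaving this on only long enough to capture a few failed connection attempts, then removing it and restarting again.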
-Greg

>
> The systems that are causing this problem are somewhat unusual; they're
> running OSDs in Docker containers, but they *should* be configured to run as
> root and have full access to the host's network stack.  They manage to work,
> mostly, but things are still really flaky.
>
> Also, is there documentation on what the various fields mean, short of
> digging through the source?  And how does Ceph resolve OSD numbers into
> host/port addresses?
>
>
> 2014-11-12 01:50:40.802604 7f7828db8700  0 -- 10.2.0.36:6819/1 >>
> 10.2.0.36:0/1 pipe(0x1ce31c80 sd=135 :6819 s=0 pgs=0 cs=0 l=1
> c=0x1e070580).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.802708 7f7816538700  0 -- 10.2.0.36:6830/1 >>
> 10.2.0.36:0/1 pipe(0x1ff61080 sd=120 :6830 s=0 pgs=0 cs=0 l=1
> c=0x1f3db2e0).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.803346 7f781ba8d700  0 -- 10.2.0.36:6819/1 >>
> 10.2.0.36:0/1 pipe(0x1ce31180 sd=125 :6819 s=0 pgs=0 cs=0 l=1
> c=0x1e070420).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.803944 7f781996c700  0 -- 10.2.0.36:6830/1 >>
> 10.2.0.36:0/1 pipe(0x1ff618c0 sd=107 :6830 s=0 pgs=0 cs=0 l=1
> c=0x1f3d8420).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.804185 7f7816538700  0 -- 10.2.0.36:6819/1 >>
> 10.2.0.36:0/1 pipe(0x1ffd1e40 sd=20 :6819 s=0 pgs=0 cs=0 l=1
> c=0x1e070840).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.805235 7f7813407700  0 -- 10.2.0.36:6819/1 >>
> 10.2.0.36:0/1 pipe(0x1ffd1340 sd=60 :6819 s=0 pgs=0 cs=0 l=1
> c=0x1b2d6260).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.806364 7f781bc8f700  0 -- 10.2.0.36:6819/1 >>
> 10.2.0.36:0/1 pipe(0x1ffd0b00 sd=162 :6819 s=0 pgs=0 cs=0 l=1
> c=0x675c580).accept replacing existing (lossy) channel (new one lossy=1)
>
> 2014-11-12 01:50:40.806425 7f781aa7d700  0 -- 10.2.0.36:6830/1 >>
> 10.2.0.36:0/1 pipe(0x1db29600 sd=143 :6830 s=0 pgs=0 cs=0 l=1
> c=0x1f3d9600).accept replacing existing (lossy) channel (new one lossy=1)
>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>




