Hi Yonggang,

Sorry, I misread the debug output. You're right, the PIDs are off by one because the daemons do a fork() and exit, and it's the child that survives and stays active.
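Concretely, the pattern is roughly the sketch below. This is only an illustration of the usual daemonize sequence, not the actual Ceph startup code (the printf and the structure are made up): the PID that ends up in the printed address is read before the fork, so the child that actually keeps running usually has a PID one higher.

    // Illustration only: why the PID printed at startup is one lower than
    // what "ps" shows for the daemon afterwards.
    #include <sys/types.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        // The address is announced while still in the original process,
        // so the number printed here is the pre-fork PID (e.g. .../1824).
        std::printf("starting mds.0 at 0.0.0.0:6800/%d\n", (int)getpid());

        // Then the process daemonizes: fork(), the parent exits,
        // and the child carries on.
        pid_t child = fork();
        if (child < 0) {
            std::perror("fork");
            return 1;
        }
        if (child > 0)
            std::exit(0);   // parent goes away

        // The surviving child typically gets the next PID (e.g. 1825),
        // which is what "ps uax | grep ceph" reports.
        // ... the daemon keeps running here ...
        return 0;
    }

That lines up with your numbers below: mds.0 printed .../1824 and ps shows cmds as 1825, osd0 printed 1622 vs 1623 in ps, and osd1 printed 1616 vs 1617. (The mon line ends in /0 because the monitor's address uses a fixed nonce rather than a PID.)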
In any case, the source of the offending message is

  192.168.89.131:0/753148386

The large number at the end is just random. I'm guessing that's the kernel client mounting, or trying to mount, the file system. What kernel version, or what version of the kernel module, are you using?
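One more note, since it bears on the question above: the "failed to decode message of type 784 v2: buffer::end_of_buffer" lines in your mds.0.log below mean the MDS ran off the end of the received payload while decoding it, which is what you would expect if the sender encodes fewer fields than the receiver's decoder tries to read (as I mentioned in my earlier reply, quoted below, 784 is a client_caps message). The following is a minimal sketch of that failure mode under that assumption; the reader type and field layout are invented for illustration and are not Ceph's actual bufferlist or MClientCaps code.

    // Illustration only: a decoder that expects more data than was encoded
    // throws an "end of buffer" style error.
    #include <cstdint>
    #include <cstring>
    #include <iostream>
    #include <stdexcept>
    #include <vector>

    struct reader {
        const std::vector<uint8_t>& data;
        size_t off = 0;
        explicit reader(const std::vector<uint8_t>& d) : data(d) {}
        void copy_out(void* dst, size_t len) {
            if (off + len > data.size())
                throw std::runtime_error("end_of_buffer");  // cf. buffer::end_of_buffer
            std::memcpy(dst, data.data() + off, len);
            off += len;
        }
    };

    int main() {
        // Suppose an older client encodes only two 32-bit fields...
        std::vector<uint8_t> payload(8, 0);

        // ...but the decoder on the receiving side expects a third field too.
        reader r(payload);
        uint32_t a, b, c;
        try {
            r.copy_out(&a, sizeof a);
            r.copy_out(&b, sizeof b);
            r.copy_out(&c, sizeof c);   // runs past the end: throws
        } catch (const std::exception& e) {
            std::cerr << "failed to decode message: " << e.what() << "\n";
        }
        return 0;
    }

If something like that is going on, it would also explain the repeating "mds0 192.168.89.133:6800 socket closed" messages in the client's dmesg further down: the MDS ends up closing the connection after each decode failure, the kernel client reconnects and resends the same message, and the cycle repeats. That is why the kernel (or kernel module) version matters here.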
Thanks-
sage

On Wed, 27 Oct 2010, Yonggang Liu wrote:
> ---------- Forwarded message ----------
> From: Yonggang Liu <myidpt@xxxxxxxxx>
> Date: Wed, Oct 27, 2010 at 7:32 PM
> Subject: Re: mds port 6800 socket closed when accessing mount point
> To: Sage Weil <sage@xxxxxxxxxxxx>
>
> Hi Sage,
>
> Thank you very much for your reply. Regarding your question: PID 1978 is
> not a running process; it is the PID Ceph thinks the mds0 process is using.
> But the real PID of mds0 was not that one (it seems to be 1979).
> I redid the process and gathered the results, and I think I see a little
> bit of the problem now. When I start Ceph, the program prints out the PIDs
> of the daemons running on each machine, but these PIDs are not the exact
> PIDs of the real running daemons.
>
> I started the system and got the PID information:
> mds0:~# /etc/init.d/ceph -a start
> === mon.0 ===
> Starting Ceph mon0 on mds0...
> ** WARNING: Ceph is still under heavy development, and is only suitable for **
> ** testing and review. Do not trust it with important data. **
> starting mon.0 rank 0 at 192.168.89.133:6789/0 mon_data /data/mon0 fsid 9ff93f53-5e63-011c-4437-8e3f780cfcde
> === mds.0 ===
> Starting Ceph mds0 on mds0...
> ** WARNING: Ceph is still under heavy development, and is only suitable for **
> ** testing and review. Do not trust it with important data. **
> starting mds.0 at 0.0.0.0:6800/1824
> === osd.0 ===
> Starting Ceph osd0 on osd0...
> ** WARNING: Ceph is still under heavy development, and is only suitable for **
> ** testing and review. Do not trust it with important data. **
> starting osd0 at 0.0.0.0:6800/1622 osd_data /data/osd0 /data/osd0/journal
> === osd.1 ===
> Starting Ceph osd1 on osd1...
> ** WARNING: Ceph is still under heavy development, and is only suitable for **
> ** testing and review. Do not trust it with important data. **
> starting osd1 at 0.0.0.0:6800/1616 osd_data /data/osd1 /data/osd1/journal
>
> So from the printout, the PIDs known by Ceph are:
> mon.0  0
> mds.0  1824
> osd.0  1622
> osd.1  1616
>
> But when I checked the PIDs using "ps uax | grep ceph" on the 3 machines, I got this:
> On mds0:
> root  1792  1.0  1.2  46412  3260 ?  S<sl 20:14  0:07 /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf
> root  1825 28.3  1.2  72228  3076 ?  S<sl 20:14  3:06 /usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf
> On osd0:
> root  1623  0.9  8.3 214820 21356 ?  S<sl 20:07  0:01 /usr/bin/cosd -i 0 -c /tmp/ceph.conf.1747
> On osd1:
> root  1617  0.7  7.5 213780 19264 ?  S<sl 20:07  0:01 /usr/bin/cosd -i 1 -c /tmp/ceph.conf.1747
> (Note: the second field is the PID.)
>
> Running "netstat -anp | grep 6800" on mds0, I got:
> mds0:~# netstat -anp | grep 6800
> tcp  0  0 0.0.0.0:6800           0.0.0.0:*              LISTEN       1825/cmds
> tcp  0  0 192.168.89.133:33216   192.168.89.135:6800    ESTABLISHED  1825/cmds
> tcp  0  0 192.168.89.133:56908   192.168.89.134:6800    ESTABLISHED  1825/cmds
> tcp  0  0 192.168.89.133:6800    192.168.89.131:45334   ESTABLISHED  1825/cmds
> (I can see PID 1825 is taking care of the client-to-mds0 communication.)
>
> At the end of the file "mds.0.log", I saw the PID 1824:
> 2010-10-27 20:57:45.537898 b35e4b70 failed to decode message of type 784 v2: buffer::end_of_buffer
> 2010-10-27 20:57:45.538652 b35e4b70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f43488 sd=-1 pgs=9 cs=7 l=0).fault with nothing to send, going to standby
> 2010-10-27 20:58:18.044950 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/753148386 (socket is 192.168.89.131:45342/0)
> 2010-10-27 20:58:18.045247 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept connect_seq 7 vs existing 7 state 3
> 2010-10-27 20:58:18.053628 b30fdb70 failed to decode message of type 784 v2: buffer::end_of_buffer
> 2010-10-27 20:58:18.054824 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=-1 pgs=10 cs=8 l=0).fault initiating reconnect
>
> So this is strange. It seems the PIDs known by Ceph are not the real PIDs of the processes. I don't know how this could happen.
>
> The version of the Ceph code is:
> mds0:~# ceph --version
> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>
> Thank you very much!
>
>
> On Wed, Oct 27, 2010 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > Hi,
> >
> > On Wed, 27 Oct 2010, Yonggang Liu wrote:
> >> 2010-10-27 15:56:23.332473 b352eb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
> >
> > This is the problem. Can you tell what 192.168.89.133:6800/1978 is?
> > (That 1978 is a pid, btw.) It's probably a 'cfuse' I'm guessing? (The
> > ceph daemons all start with 'c', but don't have 'ceph' in the name.)
> >
> > 784 is a client_caps message. The 'v4865' looks like an uninitialized
> > variable to me, as it should be a small integer.
> >
> > Whatever daemon that is, can you run it with the '-v' argument to see
> > exactly which version it is?
> >
> > Thanks!
> > sage
> >
> >
> >> 2010-10-27 15:56:23.332730 b352eb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd0168 sd=-1 pgs=10 cs=8 l=0).fault with nothing to send, going to standby
> >> 2010-10-27 15:56:52.022573 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:41726/0)
> >> 2010-10-27 15:56:52.022945 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0 l=0).accept connect_seq 8 vs existing 8 state 3
> >> 2010-10-27 15:56:52.025299 b332cb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
> >> 2010-10-27 15:56:52.025569 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=-1 pgs=11 cs=9 l=0).fault with nothing to send, going to standby
> >> 2010-10-27 15:57:48.106337 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:41727/0)
> >> 2010-10-27 15:57:48.106522 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0 l=0).accept connect_seq 9 vs existing 9 state 3
> >> 2010-10-27 15:57:48.109498 b362fb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
> >> 2010-10-27 15:57:48.109761 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=-1 pgs=12 cs=10 l=0).fault with nothing to send, going to standby
> >>
> >> On osd0
> >> osd.0.log:
> >> 2010-10-27 15:55:01.470502 --- 1633 opened log /var/log/ceph/osd.0.log ---
> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
> >> 2010-10-27 15:55:01.485828 b72d38e0 filestore(/data/osd0) mkfs in /data/osd0
> >> 2010-10-27 15:55:01.486106 b72d38e0 filestore(/data/osd0) mkfs removing old file fsid
> >> 2010-10-27 15:55:01.516527 b72d38e0 filestore(/data/osd0) mkjournal created journal on /journal
> >> 2010-10-27 15:55:01.516734 b72d38e0 filestore(/data/osd0) mkfs done in /data/osd0
> >> 2010-10-27 15:55:01.519606 b72d38e0 filestore(/data/osd0) mount did NOT detect btrfs
> >> 2010-10-27 15:55:01.519794 b72d38e0 filestore(/data/osd0) mount found snaps <>
> >> 2010-10-27 15:55:01.548307 b5acfb70 FileStore::op_tp worker finish
> >> 2010-10-27 15:55:01.548455 b52ceb70 FileStore::op_tp worker finish
> >> 2010-10-27 15:55:01.548812 b72d38e0 journal close /journal
> >> 2010-10-27 15:55:25.201288 --- 1656 opened log /var/log/ceph/osd.0.log ---
> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
> >> 2010-10-27 15:55:25.223381 b74508e0 filestore(/data/osd0) mount did NOT detect btrfs
> >> 2010-10-27 15:55:25.224253 b74508e0 filestore(/data/osd0) mount found snaps <>
> >> 2010-10-27 15:55:25.225396 b74508e0 journal read_entry 4096 : seq 1 203 bytes
> >> 2010-10-27 15:55:26.841749 abffeb70 -- 0.0.0.0:6801/1656 >> 192.168.89.135:6801/1675 pipe(0x94ced18 sd=13 pgs=0 cs=0 l=0).connect claims to be 0.0.0.0:6801/1675 not 192.168.89.135:6801/1675 - presumably this is the same node!
> >> 2010-10-27 15:56:04.219613 abaf9b70 -- 192.168.89.134:6800/1656 >> 192.168.89.131:0/1344076202 pipe(0x96463e0 sd=16 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:36746/0)
> >>
> >> On osd1
> >> osd.1.log:
> >> 2010-10-27 15:54:59.752615 --- 1652 opened log /var/log/ceph/osd.1.log ---
> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
> >> 2010-10-27 15:54:59.766128 b73518e0 filestore(/data/osd1) mkfs in /data/osd1
> >> 2010-10-27 15:54:59.766658 b73518e0 filestore(/data/osd1) mkfs removing old file fsid
> >> 2010-10-27 15:54:59.796938 b73518e0 filestore(/data/osd1) mkjournal created journal on /journal
> >> 2010-10-27 15:54:59.797816 b73518e0 filestore(/data/osd1) mkfs done in /data/osd1
> >> 2010-10-27 15:54:59.800957 b73518e0 filestore(/data/osd1) mount did NOT detect btrfs
> >> 2010-10-27 15:54:59.801087 b73518e0 filestore(/data/osd1) mount found snaps <>
> >> 2010-10-27 15:54:59.832202 b534cb70 FileStore::op_tp worker finish
> >> 2010-10-27 15:54:59.832504 b5b4db70 FileStore::op_tp worker finish
> >> 2010-10-27 15:54:59.832723 b73518e0 journal close /journal
> >> 2010-10-27 15:55:23.050042 --- 1675 opened log /var/log/ceph/osd.1.log ---
> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
> >> 2010-10-27 15:55:23.056671 b72e18e0 filestore(/data/osd1) mount did NOT detect btrfs
> >> 2010-10-27 15:55:23.056921 b72e18e0 filestore(/data/osd1) mount found snaps <>
> >> 2010-10-27 15:55:23.057368 b72e18e0 journal read_entry 4096 : seq 1 203 bytes
> >> 2010-10-27 15:55:23.207216 b12d4b70 osd1 2 map says i am down or have a different address. switching to boot state.
> >> 2010-10-27 15:55:23.207540 b12d4b70 log [WRN] : map e2 wrongly marked me down
> >>
> >> I noticed that on osd1 the last two lines of osd.1.log are unusual, but
> >> I'm not sure if they are the reason for the problem ...
> >>
> >> Thank you very much,
> >>
> >>
> >> On Wed, Oct 27, 2010 at 1:10 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
> >> > Hi Yonggang,
> >> >
> >> > Are all of the daemons still running? What is at the end of the logfiles?
> >> >
> >> > regards,
> >> > Colin
> >> >
> >> >
> >> > On Wed, Oct 27, 2010 at 9:42 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
> >> >> Hello,
> >> >>
> >> >> I'm totally new to Ceph. Over the last few days I set up 4 VMs to run Ceph:
> >> >> "mds0" for the metadata server and monitor, "osd0" and "osd1" for two
> >> >> data servers, and "client" for the client machine. The VMs are running
> >> >> Debian 5.0 with kernel 2.6.32-5-686 (the Ceph module is enabled).
> >> >> I followed "Building kernel client" and "Debian" from the wiki, and I
> >> >> was able to start Ceph and mount it at the client. But the problem is
> >> >> that the mount point always fails and hangs forever (within about 1 min
> >> >> or less of mounting Ceph). To illustrate it better, I will show you the
> >> >> information I got on the client and mds0 machines:
> >> >>
> >> >> mds0 (192.168.89.133):
> >> >> debian:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -v
> >> >> (A lot of info)
> >> >> debian:~# /etc/init.d/ceph -a start
> >> >> (some info)
> >> >>
> >> >> client (192.168.89.131):
> >> >> debian:~# mount -t ceph 192.168.89.133:/ /ceph
> >> >> debian:~# cd /ceph
> >> >> debian:/ceph# cp ~/app_ch.xls .
> >> >> debian:/ceph# ls
> >> >> (waiting forever)
> >> >> ^C
> >> >>
> >> >> After the failure I ran dmesg on the client side and got:
> >> >> client (192.168.89.131):
> >> >> debian:/ceph# dmesg -c
> >> >> [ 636.664425] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
> >> >> [ 636.694973] ceph: client4100 fsid 423ad64c-bbf0-3011-bb47-36a89f8787c6
> >> >> [ 636.700716] ceph: mon0 192.168.89.133:6789 session established
> >> >> [ 664.114551] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 664.848722] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 665.914923] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 667.840396] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 672.054106] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 680.894531] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 696.928496] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 720.171754] ceph: mds0 caps stale
> >> >> [ 728.999701] ceph: mds0 192.168.89.133:6800 socket closed
> >> >> [ 794.640943] ceph: mds0 192.168.89.133:6800 socket closed
> >> >>
> >> >> Immediately after the failure, I ran netstat at mds0:
> >> >> mds0 (192.168.89.133):
> >> >> debian:~# netstat -anp
> >> >> Active Internet connections (servers and established)
> >> >> Proto Recv-Q Send-Q Local Address            Foreign Address          State        PID/Program name
> >> >> tcp   0      0      0.0.0.0:6800             0.0.0.0:*                LISTEN       1889/cmds
> >> >> tcp   0      0      0.0.0.0:22               0.0.0.0:*                LISTEN       1529/sshd
> >> >> tcp   0      0      192.168.89.133:6789      0.0.0.0:*                LISTEN       1840/cmon
> >> >> tcp   0      0      192.168.89.133:6789      192.168.89.131:56855     ESTABLISHED  1840/cmon
> >> >> tcp   0      0      192.168.89.133:43647     192.168.89.133:6789      ESTABLISHED  1889/cmds
> >> >> tcp   0      0      192.168.89.133:22        192.168.89.1:58304       ESTABLISHED  1530/0
> >> >> tcp   0      0      192.168.89.133:39826     192.168.89.134:6800      ESTABLISHED  1889/cmds
> >> >> tcp   0      0      192.168.89.133:6789      192.168.89.134:41289     ESTABLISHED  1840/cmon
> >> >> tcp   0      0      192.168.89.133:6800      192.168.89.131:52814     TIME_WAIT    -
> >> >> tcp   0      0      192.168.89.133:6789      192.168.89.135:41021     ESTABLISHED  1840/cmon
> >> >> tcp   0      0      192.168.89.133:42069     192.168.89.135:6800      ESTABLISHED  1889/cmds
> >> >> tcp   0      0      192.168.89.133:6789      192.168.89.133:43647     ESTABLISHED  1840/cmon
> >> >> tcp   0      0      192.168.89.133:6800      192.168.89.131:52815     TIME_WAIT    -
> >> >> tcp   0      0      192.168.89.133:6800      192.168.89.131:52816     TIME_WAIT    -
> >> >> tcp6  0      0      :::22                    :::*                     LISTEN       1529/sshd
> >> >> udp   0      0      0.0.0.0:68               0.0.0.0:*                             1490/dhclient3
> >> >> Active UNIX domain sockets (servers and established)
> >> >> Proto RefCnt Flags  Type   State  I-Node  PID/Program name  Path
> >> >> unix  2      [ ]    DGRAM         2972    546/udevd         @/org/kernel/udev/udevd
> >> >> unix  4      [ ]    DGRAM         5343    1358/rsyslogd     /dev/log
> >> >> unix  2      [ ]    DGRAM         5662    1530/0
> >> >> unix  2      [ ]    DGRAM         5486    1490/dhclient3
> >> >> debian:~#
> >> >> debian:~# dmesg -c
> >> >> debian:~# (nothing shows up)
> >> >>
> >> >> I saw that port 6800 on the metadata server, which talks to the client, is
> >> >> in the "TIME_WAIT" state. That means the connection is closed.
> >> >> This is the ceph.conf I have:
> >> >> [global]
> >> >> pid file = /var/run/ceph/$type.$id.pid
> >> >> [mon]
> >> >> mon data = /data/mon$id
> >> >> mon subscribe interval = 6000
> >> >> mon osd down out interval = 6000
> >> >> [mon0]
> >> >> host = mds0
> >> >> mon addr = 192.168.89.133:6789
> >> >> [mds]
> >> >> mds session timeout = 6000
> >> >> mds session autoclose = 6000
> >> >> mds client lease = 6000
> >> >> keyring = /data/keyring.$name
> >> >> [mds0]
> >> >> host = mds0
> >> >> [osd]
> >> >> sudo = true
> >> >> osd data = /data/osd$id
> >> >> osd journal = /journal
> >> >> osd journal size = 1024
> >> >> filestore journal writeahead = true
> >> >> [osd0]
> >> >> host = osd0
> >> >> [osd1]
> >> >> host = osd1
> >> >> [group everyone]
> >> >> addr = 0.0.0.0/0
> >> >> [mount]
> >> >> allow = %everyone
> >> >> ;-----------------------------------end-----------------------------------
> >> >>
> >> >> The Ceph version I was using is 0.22.1.
> >> >>
> >> >> Can anyone help me solve this problem? Thanks in advance!
> >>
> >> --
> >> Yonggang Liu
> >> Advanced Computing and Information Systems Laboratory
> >> University of Florida
>
> --
> Yonggang Liu
> Advanced Computing and Information Systems Laboratory
> University of Florida