---------- Forwarded message ----------
From: Yonggang Liu <myidpt@xxxxxxxxx>
Date: Wed, Oct 27, 2010 at 7:32 PM
Subject: Re: mds port 6800 socket closed when accessing mount point
To: Sage Weil <sage@xxxxxxxxxxxx>

Hi Sage,

Thank you very much for your reply. To answer your question: PID 1978 is not a
running process; it is the PID Ceph reported for the mds0 daemon, but the real
PID of mds0 was not that one (it seemed to be 1979).

I repeated the procedure and gathered the results. I think I can see part of
the problem now: when Ceph starts, the init script prints the PIDs of the
daemons it starts on each machine, but these are not the exact PIDs of the
daemons that are actually running.

I started the system and got this PID information:

mds0:~# /etc/init.d/ceph -a start
=== mon.0 ===
Starting Ceph mon0 on mds0...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting mon.0 rank 0 at 192.168.89.133:6789/0 mon_data /data/mon0 fsid 9ff93f53-5e63-011c-4437-8e3f780cfcde
=== mds.0 ===
Starting Ceph mds0 on mds0...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting mds.0 at 0.0.0.0:6800/1824
=== osd.0 ===
Starting Ceph osd0 on osd0...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting osd0 at 0.0.0.0:6800/1622 osd_data /data/osd0 /data/osd0/journal
=== osd.1 ===
Starting Ceph osd1 on osd1...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting osd1 at 0.0.0.0:6800/1616 osd_data /data/osd1 /data/osd1/journal

So according to this output, the PIDs known by Ceph are:

mon.0  0
mds.0  1824
osd.0  1622
osd.1  1616

But when I checked the PIDs with "ps aux | grep ceph" on the three machines, I
got this:

On mds0:
root 1792  1.0 1.2 46412  3260 ? S<sl 20:14 0:07 /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf
root 1825 28.3 1.2 72228  3076 ? S<sl 20:14 3:06 /usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf

On osd0:
root 1623  0.9 8.3 214820 21356 ? S<sl 20:07 0:01 /usr/bin/cosd -i 0 -c /tmp/ceph.conf.1747

On osd1:
root 1617  0.7 7.5 213780 19264 ? S<sl 20:07 0:01 /usr/bin/cosd -i 1 -c /tmp/ceph.conf.1747

(Note: the second field is the PID.)

Running "netstat -anp | grep 6800" on mds0, I got:

mds0:~# netstat -anp | grep 6800
tcp  0  0 0.0.0.0:6800          0.0.0.0:*              LISTEN       1825/cmds
tcp  0  0 192.168.89.133:33216  192.168.89.135:6800    ESTABLISHED  1825/cmds
tcp  0  0 192.168.89.133:56908  192.168.89.134:6800    ESTABLISHED  1825/cmds
tcp  0  0 192.168.89.133:6800   192.168.89.131:45334   ESTABLISHED  1825/cmds

(I can see that PID 1825 is handling the client-to-mds0 communication.)

At the end of "mds.0.log", I saw the PID 1824:

2010-10-27 20:57:45.537898 b35e4b70 failed to decode message of type 784 v2: buffer::end_of_buffer
2010-10-27 20:57:45.538652 b35e4b70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f43488 sd=-1 pgs=9 cs=7 l=0).fault with nothing to send, going to standby
2010-10-27 20:58:18.044950 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/753148386 (socket is 192.168.89.131:45342/0)
2010-10-27 20:58:18.045247 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept connect_seq 7 vs existing 7 state 3
2010-10-27 20:58:18.053628 b30fdb70 failed to decode message of type 784 v2: buffer::end_of_buffer
2010-10-27 20:58:18.054824 b30fdb70 -- 192.168.89.133:6800/1824 >>
192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=-1 pgs=10 cs=8 l=0).fault initiating reconnect

So this is strange: the PIDs known by Ceph are not the real PIDs of the
processes. I don't know how this can happen.

The Ceph version is:

mds0:~# ceph --version
ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)

Thank you very much!

On Wed, Oct 27, 2010 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Hi,
>
> On Wed, 27 Oct 2010, Yonggang Liu wrote:
>> 2010-10-27 15:56:23.332473 b352eb70 failed to decode message of type
>> 784 v4865: buffer::end_of_buffer
>
> This is the problem.  Can you tell what 192.168.89.133:6800/1978 is?
> (That 1978 is a pid, btw.)  It's probably a 'cfuse' I'm guessing?  (The
> ceph daemons all start with 'c', but don't have 'ceph' in the name.)
>
> 784 is a client_caps message.  The 'v4865' looks like an uninitialized
> variable to me, as it should be a small integer.
>
> Whatever daemon that is, can you run it with the '-v' argument to see
> exactly which version it is?
>
> Thanks!
> sage
>
>> 2010-10-27 15:56:23.332730 b352eb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd0168 sd=-1 pgs=10 cs=8 l=0).fault with nothing to send, going to standby
>> 2010-10-27 15:56:52.022573 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:41726/0)
>> 2010-10-27 15:56:52.022945 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0 l=0).accept connect_seq 8 vs existing 8 state 3
>> 2010-10-27 15:56:52.025299 b332cb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
>> 2010-10-27 15:56:52.025569 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=-1 pgs=11 cs=9 l=0).fault with nothing to send, going to standby
>> 2010-10-27 15:57:48.106337 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:41727/0)
>> 2010-10-27 15:57:48.106522 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0 l=0).accept connect_seq 9 vs existing 9 state 3
>> 2010-10-27 15:57:48.109498 b362fb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
>> 2010-10-27 15:57:48.109761 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=-1 pgs=12 cs=10 l=0).fault with nothing to send, going to standby
>>
>> On osd0
>> osd.0.log:
>> 2010-10-27 15:55:01.470502 --- 1633 opened log /var/log/ceph/osd.0.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:55:01.485828 b72d38e0 filestore(/data/osd0) mkfs in /data/osd0
>> 2010-10-27 15:55:01.486106 b72d38e0 filestore(/data/osd0) mkfs removing old file fsid
>> 2010-10-27 15:55:01.516527 b72d38e0 filestore(/data/osd0) mkjournal created journal on /journal
>> 2010-10-27 15:55:01.516734 b72d38e0 filestore(/data/osd0) mkfs done in /data/osd0
>> 2010-10-27 15:55:01.519606 b72d38e0 filestore(/data/osd0) mount did NOT detect btrfs
>> 2010-10-27 15:55:01.519794 b72d38e0 filestore(/data/osd0) mount found snaps <>
>> 2010-10-27 15:55:01.548307 b5acfb70 FileStore::op_tp worker finish
>> 2010-10-27 15:55:01.548455 b52ceb70 FileStore::op_tp worker finish
>> 2010-10-27 15:55:01.548812 b72d38e0 journal close /journal
>> 2010-10-27 15:55:25.201288 --- 1656 opened log /var/log/ceph/osd.0.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:55:25.223381 b74508e0 filestore(/data/osd0) mount did NOT detect btrfs
>> 2010-10-27 15:55:25.224253 b74508e0 filestore(/data/osd0) mount found snaps <>
>> 2010-10-27 15:55:25.225396 b74508e0 journal read_entry 4096 : seq 1 203 bytes
>> 2010-10-27 15:55:26.841749 abffeb70 -- 0.0.0.0:6801/1656 >> 192.168.89.135:6801/1675 pipe(0x94ced18 sd=13 pgs=0 cs=0 l=0).connect claims to be 0.0.0.0:6801/1675 not 192.168.89.135:6801/1675 - presumably this is the same node!
>> 2010-10-27 15:56:04.219613 abaf9b70 -- 192.168.89.134:6800/1656 >> 192.168.89.131:0/1344076202 pipe(0x96463e0 sd=16 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:36746/0)
>>
>> On osd1
>> osd.1.log:
>> 2010-10-27 15:54:59.752615 --- 1652 opened log /var/log/ceph/osd.1.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:54:59.766128 b73518e0 filestore(/data/osd1) mkfs in /data/osd1
>> 2010-10-27 15:54:59.766658 b73518e0 filestore(/data/osd1) mkfs removing old file fsid
>> 2010-10-27 15:54:59.796938 b73518e0 filestore(/data/osd1) mkjournal created journal on /journal
>> 2010-10-27 15:54:59.797816 b73518e0 filestore(/data/osd1) mkfs done in /data/osd1
>> 2010-10-27 15:54:59.800957 b73518e0 filestore(/data/osd1) mount did NOT detect btrfs
>> 2010-10-27 15:54:59.801087 b73518e0 filestore(/data/osd1) mount found snaps <>
>> 2010-10-27 15:54:59.832202 b534cb70 FileStore::op_tp worker finish
>> 2010-10-27 15:54:59.832504 b5b4db70 FileStore::op_tp worker finish
>> 2010-10-27 15:54:59.832723 b73518e0 journal close /journal
>> 2010-10-27 15:55:23.050042 --- 1675 opened log /var/log/ceph/osd.1.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:55:23.056671 b72e18e0 filestore(/data/osd1) mount did NOT detect btrfs
>> 2010-10-27 15:55:23.056921 b72e18e0 filestore(/data/osd1) mount found snaps <>
>> 2010-10-27 15:55:23.057368 b72e18e0 journal read_entry 4096 : seq 1 203 bytes
>> 2010-10-27 15:55:23.207216 b12d4b70 osd1 2 map says i am down or have a different address. switching to boot state.
>> 2010-10-27 15:55:23.207540 b12d4b70 log [WRN] : map e2 wrongly marked me down
>>
>> I noticed that the last two lines of osd.1.log on osd1 are unusual, but
>> I'm not sure whether they are the cause of the problem ...
>>
>> Thank you very much,
>>
>>
>> On Wed, Oct 27, 2010 at 1:10 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>> > Hi Yonggang,
>> >
>> > Are all of the daemons still running? What is at the end of the logfiles?
>> >
>> > regards,
>> > Colin
>> >
>> >
>> > On Wed, Oct 27, 2010 at 9:42 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
>> >> Hello,
>> >>
>> >> I'm totally new to Ceph. Over the last few days I set up 4 VMs to run Ceph:
>> >> "mds0" for the metadata server and monitor, "osd0" and "osd1" for two
>> >> data servers, and "client" for the client machine. The VMs are running
>> >> Debian 5.0 with kernel 2.6.32-5-686 (the Ceph module is enabled).
>> >> I followed "Building kernel client" and "Debian" from the wiki, and I
>> >> was able to start Ceph and mount it at the client. But the problem is
>> >> that the mount point always hangs with an infinite response time
>> >> (within about 1 min of mounting Ceph, or less). To illustrate it
>> >> better, I will show you the information I got on the client and mds0
>> >> machines:
>> >>
>> >> mds0 (192.168.89.133):
>> >> debian:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -v
>> >> (a lot of info)
>> >> debian:~# /etc/init.d/ceph -a start
>> >> (some info)
>> >>
>> >> client (192.168.89.131):
>> >> debian:~# mount -t ceph 192.168.89.133:/ /ceph
>> >> debian:~# cd /ceph
>> >> debian:/ceph# cp ~/app_ch.xls .
>> >> debian:/ceph# ls
>> >> (waiting forever)
>> >> ^C
>> >>
>> >> After the failure I ran dmesg on the client side and got:
>> >> client (192.168.89.131):
>> >> debian:/ceph# dmesg -c
>> >> [ 636.664425] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
>> >> [ 636.694973] ceph: client4100 fsid 423ad64c-bbf0-3011-bb47-36a89f8787c6
>> >> [ 636.700716] ceph: mon0 192.168.89.133:6789 session established
>> >> [ 664.114551] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 664.848722] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 665.914923] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 667.840396] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 672.054106] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 680.894531] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 696.928496] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 720.171754] ceph: mds0 caps stale
>> >> [ 728.999701] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 794.640943] ceph: mds0 192.168.89.133:6800 socket closed
>> >>
>> >> Immediately after the failure, I ran netstat on mds0:
>> >> mds0 (192.168.89.133):
>> >> debian:~# netstat -anp
>> >> Active Internet connections (servers and established)
>> >> Proto Recv-Q Send-Q Local Address          Foreign Address        State        PID/Program name
>> >> tcp        0      0 0.0.0.0:6800           0.0.0.0:*              LISTEN       1889/cmds
>> >> tcp        0      0 0.0.0.0:22             0.0.0.0:*              LISTEN       1529/sshd
>> >> tcp        0      0 192.168.89.133:6789    0.0.0.0:*              LISTEN       1840/cmon
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.131:56855   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:43647   192.168.89.133:6789    ESTABLISHED  1889/cmds
>> >> tcp        0      0 192.168.89.133:22      192.168.89.1:58304     ESTABLISHED  1530/0
>> >> tcp        0      0 192.168.89.133:39826   192.168.89.134:6800    ESTABLISHED  1889/cmds
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.134:41289   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52814   TIME_WAIT    -
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.135:41021   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:42069   192.168.89.135:6800    ESTABLISHED  1889/cmds
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.133:43647   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52815   TIME_WAIT    -
>> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52816   TIME_WAIT    -
>> >> tcp6       0      0 :::22                  :::*                   LISTEN       1529/sshd
>> >> udp        0      0 0.0.0.0:68             0.0.0.0:*                           1490/dhclient3
>> >> Active UNIX domain sockets (servers and established)
>> >> Proto RefCnt Flags  Type   State  I-Node  PID/Program name  Path
>> >> unix  2      [ ]    DGRAM         2972    546/udevd         @/org/kernel/udev/udevd
>> >> unix  4      [ ]    DGRAM         5343    1358/rsyslogd     /dev/log
>> >> unix  2      [ ]    DGRAM         5662    1530/0
>> >> unix  2      [ ]    DGRAM         5486    1490/dhclient3
>> >> debian:~#
>> >> debian:~# dmesg -c
>> >> debian:~# (nothing shows up)
>> >>
>> >> I saw that the port-6800 connections between the metadata server and
>> >> the client are in the "TIME_WAIT" state, which means those
>> >> connections were closed.
>> >> This is the ceph.conf I have:
>> >> [global]
>> >>         pid file = /var/run/ceph/$type.$id.pid
>> >> [mon]
>> >>         mon data = /data/mon$id
>> >>         mon subscribe interval = 6000
>> >>         mon osd down out interval = 6000
>> >> [mon0]
>> >>         host = mds0
>> >>         mon addr = 192.168.89.133:6789
>> >> [mds]
>> >>         mds session timeout = 6000
>> >>         mds session autoclose = 6000
>> >>         mds client lease = 6000
>> >>         keyring = /data/keyring.$name
>> >> [mds0]
>> >>         host = mds0
>> >> [osd]
>> >>         sudo = true
>> >>         osd data = /data/osd$id
>> >>         osd journal = /journal
>> >>         osd journal size = 1024
>> >>         filestore journal writeahead = true
>> >> [osd0]
>> >>         host = osd0
>> >> [osd1]
>> >>         host = osd1
>> >> [group everyone]
>> >>         addr = 0.0.0.0/0
>> >> [mount]
>> >>         allow = %everyone
>> >> ;-----------------------------------end-----------------------------------
>> >>
>> >> The Ceph version I was using is 0.22.1.
>> >>
>> >> Can anyone help me to solve this problem?
>> >> Thanks in advance!
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> >
>>
>> --
>> Yonggang Liu
>> Advanced Computing and Information Systems Laboratory
>> University of Florida

--
Yonggang Liu
Advanced Computing and Information Systems Laboratory
University of Florida
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
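[Editor's sketch of the PID cross-check discussed in the thread. The address
format "ip:port/number" is taken from the log excerpts above (e.g.
"192.168.89.133:6800/1824"); extracting the trailing number and comparing it
against a live PID is my own illustration, and the pgrep pattern for the cmds
binary is an assumption based on the process list shown in the thread, not a
documented procedure.]

```shell
#!/bin/sh
# Extract the number that Ceph prints after the '/' in a daemon address
# (as seen in mds.0.log above) so it can be compared with a real PID.
addr="192.168.89.133:6800/1824"   # address copied from the log excerpt
num="${addr##*/}"                 # strip everything up to the last '/'
echo "number in address: $num"

# On the mds host one would then compare it against the running daemon,
# e.g. (assumed pgrep pattern, matching the ps output shown above):
#   pid=$(pgrep -f '/usr/bin/cmds' | head -n1)
#   [ "$num" = "$pid" ] || echo "mismatch: address says $num, ps says $pid"
```

In the thread's data this comparison shows 1824 in the address versus 1825 in
ps, i.e. an off-by-one on every daemon.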