Thank you Sage. I solved the problem: I used the git code branch
rbd-backport, and it did not give me any error when I ran "insmod
ceph.ko". Now I can mount ceph without any failure. I think that was a
kernel compatibility problem.

Thank you,

On Thu, Oct 28, 2010 at 1:07 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
> Hi Sage,
>
> I think I followed this page http://ceph.newdream.net/wiki/Debian to
> install the kernel client module (I didn't use the git client
> stand-alone code).
> I updated the aptitude repo with:
> deb http://ceph.newdream.net/debian/ lenny ceph-stable
> deb-src http://ceph.newdream.net/debian/ lenny ceph-stable
> I upgraded my kernel:
> apt-get install linux-image-2.6.32-686
> And then I followed the instructions:
> apt-get install ceph
> apt-get install linux-headers-2.6.32-686
> cd /usr/src/modules/ceph
> make
> make modules_install
> depmod
> modprobe ceph
> (That's what I did. Maybe the package I got is not suitable for this kernel?)
>
> I also tried the git approach here
> http://ceph.newdream.net/wiki/Building_kernel_client, but I hit a
> problem when running "insmod ceph.ko" to load the module. Here are
> the error and the dmesg output:
> client:~/ceph-client-standalone# insmod ceph.ko
> insmod: error inserting 'ceph.ko': -1 Unknown symbol in module
> client:~/ceph-client-standalone# dmesg -c
> [ 4295.877555] ceph: Unknown symbol task_dirty_inc
>
> The git commit I was on is: 0184d86d147911efe080dcbbfba40f0b3617659f
>
> Any suggestions? Thank you very much!
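An "Unknown symbol" error from insmod usually means the module was built
against headers that do not match the running kernel, or that it references
a symbol the running kernel does not export. A quick way to check both, as
a sketch (it assumes ceph.ko sits in the standalone build tree and that the
tree supports the usual external-module make targets):

  # Does the module's vermagic match the running kernel?
  modinfo ceph.ko | grep vermagic
  uname -r

  # Does the running kernel export the missing symbol at all?
  grep -w task_dirty_inc /proc/kallsyms

  # modprobe, unlike insmod, also loads dependencies such as libcrc32c
  make modules_install && depmod -a && modprobe ceph

If task_dirty_inc is absent from /proc/kallsyms, no build option will help;
the branch has to be matched to a kernel that provides the symbol, which is
consistent with the rbd-backport branch working above.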
>
>
> On Wed, Oct 27, 2010 at 11:53 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> On Wed, 27 Oct 2010, Yonggang Liu wrote:
>>> > In any case, the source of the offending message is
>>> >
>>> > 192.168.89.131:0/753148386
>>> >
>>> > The large number at the end is just random. I'm guessing that's the
>>> > kernel client mounting/trying to mount the file system. What kernel
>>> > version, or version of the kernel module are you using?
>>> >
>>> Here's the kernel version of all my VMs:
>>> mds0:~# uname -a
>>> Linux mds0 2.6.32-5-686 #1 SMP Tue Oct 19 14:40:34 UTC 2010 i686 GNU/Linux
>>>
>>> The mount command I was using is:
>>> mount -t ceph mds0:/ /ceph
>>> Do you mean the failure is due to the client kernel?
>>
>> Right. Where is the ceph kernel module coming from? Are you building
>> from ceph-client-standalone.git, or some other source? If it's from
>> -standalone.git, what commit are you on? (git rev-parse HEAD)
>>
>> sage
>>
>>> > Thanks-
>>> > sage
>>> >
>>>
>>> Thank you!
>>>
>>> > On Wed, 27 Oct 2010, Yonggang Liu wrote:
>>> >
>>> >> ---------- Forwarded message ----------
>>> >> From: Yonggang Liu <myidpt@xxxxxxxxx>
>>> >> Date: Wed, Oct 27, 2010 at 7:32 PM
>>> >> Subject: Re: mds port 6800 socket closed when accessing mount point
>>> >> To: Sage Weil <sage@xxxxxxxxxxxx>
>>> >>
>>> >> Hi Sage,
>>> >>
>>> >> Thank you very much for your reply. In your question, the PID 1978 is
>>> >> not a running process; it is the PID Ceph thinks the mds0 process is
>>> >> using. The real PID of mds0 was not that one (it seems to have been 1979).
>>> >> I redid the process and gathered the results. I think I see a little
>>> >> bit of the problem now. When I start Ceph, it prints out the PIDs of
>>> >> the daemons running on each machine, but these PIDs are not the
>>> >> actual PIDs of the running daemons.
>>> >>
>>> >> I started the system and got the PID information:
>>> >> mds0:~# /etc/init.d/ceph -a start
>>> >> === mon.0 ===
>>> >> Starting Ceph mon0 on mds0...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting mon.0 rank 0 at 192.168.89.133:6789/0 mon_data /data/mon0
>>> >> fsid 9ff93f53-5e63-011c-4437-8e3f780cfcde
>>> >> === mds.0 ===
>>> >> Starting Ceph mds0 on mds0...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting mds.0 at 0.0.0.0:6800/1824
>>> >> === osd.0 ===
>>> >> Starting Ceph osd0 on osd0...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting osd0 at 0.0.0.0:6800/1622 osd_data /data/osd0 /data/osd0/journal
>>> >> === osd.1 ===
>>> >> Starting Ceph osd1 on osd1...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting osd1 at 0.0.0.0:6800/1616 osd_data /data/osd1 /data/osd1/journal
>>> >>
>>> >> So from the printout, the PIDs known by Ceph are:
>>> >> mon.0 0
>>> >> mds.0 1824
>>> >> osd.0 1622
>>> >> osd.1 1616
>>> >>
>>> >> But when I checked the PIDs using "ps uax | grep ceph" on the 3
>>> >> machines, I got this:
>>> >> On mds0:
>>> >> root 1792 1.0 1.2 46412 3260 ? S<sl 20:14 0:07 /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf
>>> >> root 1825 28.3 1.2 72228 3076 ? S<sl 20:14 3:06 /usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf
>>> >> On osd0:
>>> >> root 1623 0.9 8.3 214820 21356 ? S<sl 20:07 0:01 /usr/bin/cosd -i 0 -c /tmp/ceph.conf.1747
>>> >> On osd1:
>>> >> root 1617 0.7 7.5 213780 19264 ? S<sl 20:07 0:01 /usr/bin/cosd -i 1 -c /tmp/ceph.conf.1747
>>> >> (Note: the second field is the PID.)
>>> >>
>>> >> Running "netstat -anp | grep 6800" on mds0, I got:
>>> >> mds0:~# netstat -anp | grep 6800
>>> >> tcp 0 0 0.0.0.0:6800 0.0.0.0:* LISTEN 1825/cmds
>>> >> tcp 0 0 192.168.89.133:33216 192.168.89.135:6800 ESTABLISHED 1825/cmds
>>> >> tcp 0 0 192.168.89.133:56908 192.168.89.134:6800 ESTABLISHED 1825/cmds
>>> >> tcp 0 0 192.168.89.133:6800 192.168.89.131:45334 ESTABLISHED 1825/cmds
>>> >> (I can see PID 1825 is handling the client - mds0 communication.)
>>> >>
>>> >> At the end of the file "mds.0.log", I saw the PID 1824:
>>> >> 2010-10-27 20:57:45.537898 b35e4b70 failed to decode message of type
>>> >> 784 v2: buffer::end_of_buffer
>>> >> 2010-10-27 20:57:45.538652 b35e4b70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f43488 sd=-1 pgs=9 cs=7 l=0).fault
>>> >> with nothing to send, going to standby
>>> >> 2010-10-27 20:58:18.044950 b30fdb70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept
>>> >> peer addr is really 192.168.89.131:0/753148386 (socket is
>>> >> 192.168.89.131:45342/0)
>>> >> 2010-10-27 20:58:18.045247 b30fdb70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept
>>> >> connect_seq 7 vs existing 7 state 3
>>> >> 2010-10-27 20:58:18.053628 b30fdb70 failed to decode message of type
>>> >> 784 v2: buffer::end_of_buffer
>>> >> 2010-10-27 20:58:18.054824 b30fdb70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=-1 pgs=10 cs=8 l=0).fault
>>> >> initiating reconnect
>>> >>
>>> >> So this is strange. It seems the PIDs known by Ceph are not the real
>>> >> PIDs of the processes. I don't know how this could happen.
>>> >>
>>> >> The version of the Ceph code is:
>>> >> mds0:~# ceph --version
>>> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >>
>>> >> Thank you very much!
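The off-by-one between the advertised PIDs (1824, 1622, 1616) and the
running ones (1825, 1623, 1617) suggests each daemon prints its messenger
address before forking into the background, so the number after the "/" is
a pre-fork PID. If that is right, the mismatch is cosmetic: the address
stays unique even though the PID embedded in it is stale. A quick
cross-check along those lines, as a sketch using the names from this
thread:

  # What is actually running (cmon/cmds/cosd) on a node
  ps ax | grep -E 'c(mon|mds|osd)' | grep -v grep

  # Which process really owns the mds port
  netstat -anp | grep ':6800 '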
>>> >>
>>> >> On Wed, Oct 27, 2010 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>> >> > Hi,
>>> >> >
>>> >> > On Wed, 27 Oct 2010, Yonggang Liu wrote:
>>> >> >> 2010-10-27 15:56:23.332473 b352eb70 failed to decode message of type
>>> >> >> 784 v4865: buffer::end_of_buffer
>>> >> >
>>> >> > This is the problem. Can you tell what 192.168.89.133:6800/1978 is?
>>> >> > (That 1978 is a pid, btw.) It's probably a 'cfuse' I'm guessing? (The
>>> >> > ceph daemons all start with 'c', but don't have 'ceph' in the name.)
>>> >> >
>>> >> > 784 is a client_caps message. The 'v4865' looks like an uninitialized
>>> >> > variable to me, as it should be a small integer.
>>> >> >
>>> >> > Whatever daemon that is, can you run it with the '-v' argument to see
>>> >> > exactly which version it is?
>>> >> >
>>> >> > Thanks!
>>> >> > sage
>>> >> >
>>> >> >> 2010-10-27 15:56:23.332730 b352eb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd0168 sd=-1 pgs=10 cs=8
>>> >> >> l=0).fault with nothing to send, going to standby
>>> >> >> 2010-10-27 15:56:52.022573 b332cb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is
>>> >> >> 192.168.89.131:41726/0)
>>> >> >> 2010-10-27 15:56:52.022945 b332cb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept connect_seq 8 vs existing 8 state 3
>>> >> >> 2010-10-27 15:56:52.025299 b332cb70 failed to decode message of type
>>> >> >> 784 v4865: buffer::end_of_buffer
>>> >> >> 2010-10-27 15:56:52.025569 b332cb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=-1 pgs=11 cs=9
>>> >> >> l=0).fault with nothing to send, going to standby
>>> >> >> 2010-10-27 15:57:48.106337 b362fb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is
>>> >> >> 192.168.89.131:41727/0)
>>> >> >> 2010-10-27 15:57:48.106522 b362fb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept connect_seq 9 vs existing 9 state 3
>>> >> >> 2010-10-27 15:57:48.109498 b362fb70 failed to decode message of type
>>> >> >> 784 v4865: buffer::end_of_buffer
>>> >> >> 2010-10-27 15:57:48.109761 b362fb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=-1 pgs=12 cs=10
>>> >> >> l=0).fault with nothing to send, going to standby
>>> >> >>
>>> >> >> On osd0
>>> >> >> osd.0.log:
>>> >> >> 2010-10-27 15:55:01.470502 --- 1633 opened log /var/log/ceph/osd.0.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:55:01.485828 b72d38e0 filestore(/data/osd0) mkfs in /data/osd0
>>> >> >> 2010-10-27 15:55:01.486106 b72d38e0 filestore(/data/osd0) mkfs
>>> >> >> removing old file fsid
>>> >> >> 2010-10-27 15:55:01.516527 b72d38e0 filestore(/data/osd0) mkjournal
>>> >> >> created journal on /journal
>>> >> >> 2010-10-27 15:55:01.516734 b72d38e0 filestore(/data/osd0) mkfs done in
>>> >> >> /data/osd0
>>> >> >> 2010-10-27 15:55:01.519606 b72d38e0 filestore(/data/osd0) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:55:01.519794 b72d38e0 filestore(/data/osd0) mount found snaps <>
>>> >> >> 2010-10-27 15:55:01.548307 b5acfb70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:55:01.548455 b52ceb70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:55:01.548812 b72d38e0 journal close /journal
>>> >> >> 2010-10-27 15:55:25.201288 --- 1656 opened log /var/log/ceph/osd.0.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:55:25.223381 b74508e0 filestore(/data/osd0) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:55:25.224253 b74508e0 filestore(/data/osd0) mount found snaps <>
>>> >> >> 2010-10-27 15:55:25.225396 b74508e0 journal read_entry 4096 : seq 1 203 bytes
>>> >> >> 2010-10-27 15:55:26.841749 abffeb70 -- 0.0.0.0:6801/1656 >>
>>> >> >> 192.168.89.135:6801/1675 pipe(0x94ced18 sd=13 pgs=0 cs=0 l=0).connect
>>> >> >> claims to be 0.0.0.0:6801/1675 not 192.168.89.135:6801/1675 -
>>> >> >> presumably this is the same node!
>>> >> >> 2010-10-27 15:56:04.219613 abaf9b70 -- 192.168.89.134:6800/1656 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x96463e0 sd=16 pgs=0 cs=0
>>> >> >> l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is
>>> >> >> 192.168.89.131:36746/0)
>>> >> >>
>>> >> >> On osd1
>>> >> >> osd.1.log:
>>> >> >> 2010-10-27 15:54:59.752615 --- 1652 opened log /var/log/ceph/osd.1.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:54:59.766128 b73518e0 filestore(/data/osd1) mkfs in /data/osd1
>>> >> >> 2010-10-27 15:54:59.766658 b73518e0 filestore(/data/osd1) mkfs
>>> >> >> removing old file fsid
>>> >> >> 2010-10-27 15:54:59.796938 b73518e0 filestore(/data/osd1) mkjournal
>>> >> >> created journal on /journal
>>> >> >> 2010-10-27 15:54:59.797816 b73518e0 filestore(/data/osd1) mkfs done in
>>> >> >> /data/osd1
>>> >> >> 2010-10-27 15:54:59.800957 b73518e0 filestore(/data/osd1) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:54:59.801087 b73518e0 filestore(/data/osd1) mount found snaps <>
>>> >> >> 2010-10-27 15:54:59.832202 b534cb70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:54:59.832504 b5b4db70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:54:59.832723 b73518e0 journal close /journal
>>> >> >> 2010-10-27 15:55:23.050042 --- 1675 opened log /var/log/ceph/osd.1.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:55:23.056671 b72e18e0 filestore(/data/osd1) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:55:23.056921 b72e18e0 filestore(/data/osd1) mount found snaps <>
>>> >> >> 2010-10-27 15:55:23.057368 b72e18e0 journal read_entry 4096 : seq 1 203 bytes
>>> >> >> 2010-10-27 15:55:23.207216 b12d4b70 osd1 2 map says i am down or have
>>> >> >> a different address. switching to boot state.
>>> >> >> 2010-10-27 15:55:23.207540 b12d4b70 log [WRN] : map e2 wrongly marked me down
>>> >> >>
>>> >> >> I noticed that the last two lines of osd.1.log on osd1 are unusual,
>>> >> >> but I'm not sure if they are the reason for the problem ...
>>> >> >>
>>> >> >> Thank you very much,
>>> >> >>
>>> >> >> On Wed, Oct 27, 2010 at 1:10 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>> >> >> > Hi Yonggang,
>>> >> >> >
>>> >> >> > Are all of the daemons still running? What is at the end of the logfiles?
>>> >> >> >
>>> >> >> > regards,
>>> >> >> > Colin
>>> >> >> >
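One way to answer both of Colin's questions across all three nodes at once,
as a sketch (the hostnames are the ones used in this thread, and it assumes
root ssh access between the VMs):

  for h in mds0 osd0 osd1; do
    echo "== $h =="
    # Are the daemons still running?
    ssh root@$h "ps ax | grep -E 'c(mon|mds|osd)' | grep -v grep"
    # What is at the end of the logfiles?
    ssh root@$h "tail -n 5 /var/log/ceph/*.log"
  done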
>>> >> >> > On Wed, Oct 27, 2010 at 9:42 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
>>> >> >> >> Hello,
>>> >> >> >>
>>> >> >> >> I'm totally new to Ceph. In the last few days I set up 4 VMs to run Ceph:
>>> >> >> >> "mds0" for the metadata server and monitor, "osd0" and "osd1" for two
>>> >> >> >> data servers, and "client" for the client machine. The VMs are running
>>> >> >> >> Debian 5.0 with kernel 2.6.32-5-686 (the Ceph module is enabled).
>>> >> >> >> I followed "Building kernel client" and "Debian" from the wiki, and I
>>> >> >> >> was able to start Ceph and mount it at the client. But the problem is,
>>> >> >> >> the mount point always fails with an infinite response time (within
>>> >> >> >> about a minute of mounting). To illustrate it better, I will show you
>>> >> >> >> the information I got on the client and mds0 machines:
>>> >> >> >>
>>> >> >> >> mds0 (192.168.89.133):
>>> >> >> >> debian:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -v
>>> >> >> >> (A lot of info)
>>> >> >> >> debian:~# /etc/init.d/ceph -a start
>>> >> >> >> (some info)
>>> >> >> >>
>>> >> >> >> client (192.168.89.131):
>>> >> >> >> debian:~# mount -t ceph 192.168.89.133:/ /ceph
>>> >> >> >> debian:~# cd /ceph
>>> >> >> >> debian:/ceph# cp ~/app_ch.xls .
>>> >> >> >> debian:/ceph# ls
>>> >> >> >> (waiting forever)
>>> >> >> >> ^C
>>> >> >> >>
>>> >> >> >> After the failure I ran dmesg at the client side and got:
>>> >> >> >> client (192.168.89.131):
>>> >> >> >> debian:/ceph# dmesg -c
>>> >> >> >> [ 636.664425] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
>>> >> >> >> [ 636.694973] ceph: client4100 fsid 423ad64c-bbf0-3011-bb47-36a89f8787c6
>>> >> >> >> [ 636.700716] ceph: mon0 192.168.89.133:6789 session established
>>> >> >> >> [ 664.114551] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 664.848722] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 665.914923] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 667.840396] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 672.054106] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 680.894531] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 696.928496] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 720.171754] ceph: mds0 caps stale
>>> >> >> >> [ 728.999701] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 794.640943] ceph: mds0 192.168.89.133:6800 socket closed
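That repeating "socket closed" pattern means the client keeps reconnecting
and the mds keeps dropping the session, which lines up with the decode
failures seen in mds.0.log later in the thread. One way to watch it happen
from the mds side while reproducing the hang, as a sketch:

  # On mds0: watch connection states on the mds port while the client retries
  watch -n 1 'netstat -an | grep 6800'

  # ... and follow the mds log for the matching fault lines
  tail -f /var/log/ceph/mds.0.log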
>>> >> >> >>
>>> >> >> >> Immediately after the failure, I ran netstat at mds0:
>>> >> >> >> debian:~# netstat -anp
>>> >> >> >> Active Internet connections (servers and established)
>>> >> >> >> Proto Recv-Q Send-Q Local Address          Foreign Address        State       PID/Program name
>>> >> >> >> tcp        0      0 0.0.0.0:6800           0.0.0.0:*              LISTEN      1889/cmds
>>> >> >> >> tcp        0      0 0.0.0.0:22             0.0.0.0:*              LISTEN      1529/sshd
>>> >> >> >> tcp        0      0 192.168.89.133:6789    0.0.0.0:*              LISTEN      1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.131:56855   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:43647   192.168.89.133:6789    ESTABLISHED 1889/cmds
>>> >> >> >> tcp        0      0 192.168.89.133:22      192.168.89.1:58304     ESTABLISHED 1530/0
>>> >> >> >> tcp        0      0 192.168.89.133:39826   192.168.89.134:6800    ESTABLISHED 1889/cmds
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.134:41289   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52814   TIME_WAIT   -
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.135:41021   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:42069   192.168.89.135:6800    ESTABLISHED 1889/cmds
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.133:43647   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52815   TIME_WAIT   -
>>> >> >> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52816   TIME_WAIT   -
>>> >> >> >> tcp6       0      0 :::22                  :::*                   LISTEN      1529/sshd
>>> >> >> >> udp        0      0 0.0.0.0:68             0.0.0.0:*                          1490/dhclient3
>>> >> >> >> Active UNIX domain sockets (servers and established)
>>> >> >> >> Proto RefCnt Flags Type   State  I-Node PID/Program name Path
>>> >> >> >> unix  2      [ ]   DGRAM         2972   546/udevd        @/org/kernel/udev/udevd
>>> >> >> >> unix  4      [ ]   DGRAM         5343   1358/rsyslogd    /dev/log
>>> >> >> >> unix  2      [ ]   DGRAM         5662   1530/0
>>> >> >> >> unix  2      [ ]   DGRAM         5486   1490/dhclient3
>>> >> >> >> debian:~#
>>> >> >> >> debian:~# dmesg -c
>>> >> >> >> debian:~# (nothing shows up)
>>> >> >> >>
>>> >> >> >> I saw that port 6800 on the metadata server, the one talking to the
>>> >> >> >> client, is in the "TIME_WAIT" state. That means the connection was
>>> >> >> >> closed.
>>> >> >> >> This is the ceph.conf I have:
>>> >> >> >> [global]
>>> >> >> >> pid file = /var/run/ceph/$type.$id.pid
>>> >> >> >> [mon]
>>> >> >> >> mon data = /data/mon$id
>>> >> >> >> mon subscribe interval = 6000
>>> >> >> >> mon osd down out interval = 6000
>>> >> >> >> [mon0]
>>> >> >> >> host = mds0
>>> >> >> >> mon addr = 192.168.89.133:6789
>>> >> >> >> [mds]
>>> >> >> >> mds session timeout = 6000
>>> >> >> >> mds session autoclose = 6000
>>> >> >> >> mds client lease = 6000
>>> >> >> >> keyring = /data/keyring.$name
>>> >> >> >> [mds0]
>>> >> >> >> host = mds0
>>> >> >> >> [osd]
>>> >> >> >> sudo = true
>>> >> >> >> osd data = /data/osd$id
>>> >> >> >> osd journal = /journal
>>> >> >> >> osd journal size = 1024
>>> >> >> >> filestore journal writeahead = true
>>> >> >> >> [osd0]
>>> >> >> >> host = osd0
>>> >> >> >> [osd1]
>>> >> >> >> host = osd1
>>> >> >> >> [group everyone]
>>> >> >> >> addr = 0.0.0.0/0
>>> >> >> >> [mount]
>>> >> >> >> allow = %everyone
>>> >> >> >> ;-----------------------------------end-----------------------------------
>>> >> >> >>
>>> >> >> >> The Ceph version I was using is 0.22.1.
>>> >> >> >>
>>> >> >> >> Can anyone help me to solve this problem? Thanks in advance!
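One detail in this ceph.conf ties back to the PID confusion discussed
further up the thread: with "pid file = /var/run/ceph/$type.$id.pid", each
daemon should record its PID after daemonizing, so the pid files ought to
be more trustworthy than the numbers echoed at startup. A sketch of a check
along those lines (the paths simply follow the conf above):

  # Compare recorded PIDs with what is actually running
  for f in /var/run/ceph/*.pid; do
    echo "$f -> $(cat "$f")"
  done
  ps ax | grep -E 'c(mon|mds|osd)' | grep -v grep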
>>> >> >> >>
>>> >> >> >
>>> >> >>
>>> >> >> --
>>> >> >> Yonggang Liu
>>> >> >> Advanced Computing and Information Systems Laboratory
>>> >> >> University of Florida
>>> >>
>>> >>
>>> >> --
>>> >> Yonggang Liu
>>> >> Advanced Computing and Information Systems Laboratory
>>> >> University of Florida
>>>
>>>
>>> --
>>> Yonggang Liu
>>> Advanced Computing and Information Systems Laboratory
>>> University of Florida
>
>
> --
> Yonggang Liu
> Advanced Computing and Information Systems Laboratory
> University of Florida
>

--
Yonggang Liu
Advanced Computing and Information Systems Laboratory
University of Florida
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html