Thank you Sage. I solved the problem: I used the git code branch
rbd-backport, and it did not give me any error when I ran "insmod
ceph.ko". Now I can mount ceph without any failure. I think that was a
kernel compatibility problem.

Thank you,

On Thu, Oct 28, 2010 at 1:07 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
> Hi Sage,
>
> I think I followed this page http://ceph.newdream.net/wiki/Debian to
> install the kernel client module (I didn't use the git client
> stand-alone code).
> I updated the aptitude repo with:
> deb http://ceph.newdream.net/debian/ lenny ceph-stable
> deb-src http://ceph.newdream.net/debian/ lenny ceph-stable
> I upgraded my kernel:
> apt-get install linux-image-2.6.32-686
> And then I followed the instructions:
> apt-get install ceph
> apt-get install linux-headers-2.6.32-686
> cd /usr/src/modules/ceph
> make
> make modules_install
> depmod
> modprobe ceph
> (That's what I did. Maybe the package I got is not suitable for this kernel?)
>
> I also tried the git approach here
> http://ceph.newdream.net/wiki/Building_kernel_client, but I hit a
> problem when running "insmod ceph.ko" to load the module. Here are
> the error and the dmesg output:
> client:~/ceph-client-standalone# insmod ceph.ko
> insmod: error inserting 'ceph.ko': -1 Unknown symbol in module
> client:~/ceph-client-standalone# dmesg -c
> [ 4295.877555] ceph: Unknown symbol task_dirty_inc
>
> The git commit I was on is: 0184d86d147911efe080dcbbfba40f0b3617659f
>
> Any suggestions? Thank you very much!
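An "Unknown symbol" error from insmod usually means the module was built
against headers that do not match the running kernel, or that it references
a symbol the running kernel does not export. A quick way to check both, as
a sketch (it assumes ceph.ko sits in the standalone build tree and that the
tree supports the usual external-module make targets):

  # Does the module's vermagic match the running kernel?
  modinfo ceph.ko | grep vermagic
  uname -r

  # Does the running kernel export the missing symbol at all?
  grep -w task_dirty_inc /proc/kallsyms

  # modprobe, unlike insmod, also loads dependencies such as libcrc32c
  make modules_install && depmod -a && modprobe ceph

If task_dirty_inc is absent from /proc/kallsyms, no build option will help;
the branch has to be matched to a kernel that provides the symbol, which is
consistent with the rbd-backport branch working above.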
>
>
> On Wed, Oct 27, 2010 at 11:53 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>> On Wed, 27 Oct 2010, Yonggang Liu wrote:
>>> > In any case, the source of the offending message is
>>> >
>>> > 192.168.89.131:0/753148386
>>> >
>>> > The large number at the end is just random. I'm guessing that's the
>>> > kernel client mounting/trying to mount the file system. What kernel
>>> > version, or version of the kernel module are you using?
>>> >
>>> Here's the kernel version of all my VMs:
>>> mds0:~# uname -a
>>> Linux mds0 2.6.32-5-686 #1 SMP Tue Oct 19 14:40:34 UTC 2010 i686 GNU/Linux
>>>
>>> The mount command I was using is:
>>> mount -t ceph mds0:/ /ceph
>>> Do you mean the failure is due to the client kernel?
>>
>> Right. Where is the ceph kernel module coming from? Are you building
>> from ceph-client-standalone.git, or some other source? If it's from
>> -standalone.git, what commit are you on? (git rev-parse HEAD)
>>
>> sage
>>
>>> > Thanks-
>>> > sage
>>> >
>>>
>>> Thank you!
>>>
>>> > On Wed, 27 Oct 2010, Yonggang Liu wrote:
>>> >
>>> >> ---------- Forwarded message ----------
>>> >> From: Yonggang Liu <myidpt@xxxxxxxxx>
>>> >> Date: Wed, Oct 27, 2010 at 7:32 PM
>>> >> Subject: Re: mds port 6800 socket closed when accessing mount point
>>> >> To: Sage Weil <sage@xxxxxxxxxxxx>
>>> >>
>>> >> Hi Sage,
>>> >>
>>> >> Thank you very much for your reply. In your question, the PID 1978 is
>>> >> not a running process; it is the PID Ceph thinks the mds0 process is
>>> >> using. The real PID of mds0 was not that one (it seems to have been 1979).
>>> >> I redid the process and gathered the results. I think I see a little
>>> >> bit of the problem now. When I start Ceph, it prints out the PIDs of
>>> >> the daemons running on each machine, but these PIDs are not the
>>> >> actual PIDs of the running daemons.
>>> >>
>>> >> I started the system and got the PID information:
>>> >> mds0:~# /etc/init.d/ceph -a start
>>> >> === mon.0 ===
>>> >> Starting Ceph mon0 on mds0...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting mon.0 rank 0 at 192.168.89.133:6789/0 mon_data /data/mon0
>>> >> fsid 9ff93f53-5e63-011c-4437-8e3f780cfcde
>>> >> === mds.0 ===
>>> >> Starting Ceph mds0 on mds0...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting mds.0 at 0.0.0.0:6800/1824
>>> >> === osd.0 ===
>>> >> Starting Ceph osd0 on osd0...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting osd0 at 0.0.0.0:6800/1622 osd_data /data/osd0 /data/osd0/journal
>>> >> === osd.1 ===
>>> >> Starting Ceph osd1 on osd1...
>>> >> ** WARNING: Ceph is still under heavy development, and is only suitable for **
>>> >> ** testing and review. Do not trust it with important data. **
>>> >> starting osd1 at 0.0.0.0:6800/1616 osd_data /data/osd1 /data/osd1/journal
>>> >>
>>> >> So from the printout, the PIDs known by Ceph are:
>>> >> mon.0 0
>>> >> mds.0 1824
>>> >> osd.0 1622
>>> >> osd.1 1616
>>> >>
>>> >> But when I checked the PIDs using "ps uax | grep ceph" on the 3
>>> >> machines, I got this:
>>> >> On mds0:
>>> >> root 1792 1.0 1.2 46412 3260 ? S<sl 20:14 0:07 /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf
>>> >> root 1825 28.3 1.2 72228 3076 ? S<sl 20:14 3:06 /usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf
>>> >> On osd0:
>>> >> root 1623 0.9 8.3 214820 21356 ? S<sl 20:07 0:01 /usr/bin/cosd -i 0 -c /tmp/ceph.conf.1747
>>> >> On osd1:
>>> >> root 1617 0.7 7.5 213780 19264 ? S<sl 20:07 0:01 /usr/bin/cosd -i 1 -c /tmp/ceph.conf.1747
>>> >> (Note: the second field is the PID.)
>>> >>
>>> >> Running "netstat -anp | grep 6800" on mds0, I got:
>>> >> mds0:~# netstat -anp | grep 6800
>>> >> tcp 0 0 0.0.0.0:6800 0.0.0.0:* LISTEN 1825/cmds
>>> >> tcp 0 0 192.168.89.133:33216 192.168.89.135:6800 ESTABLISHED 1825/cmds
>>> >> tcp 0 0 192.168.89.133:56908 192.168.89.134:6800 ESTABLISHED 1825/cmds
>>> >> tcp 0 0 192.168.89.133:6800 192.168.89.131:45334 ESTABLISHED 1825/cmds
>>> >> (I can see PID 1825 is handling the client - mds0 communication.)
>>> >>
>>> >> At the end of the file "mds.0.log", I saw the PID 1824:
>>> >> 2010-10-27 20:57:45.537898 b35e4b70 failed to decode message of type
>>> >> 784 v2: buffer::end_of_buffer
>>> >> 2010-10-27 20:57:45.538652 b35e4b70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f43488 sd=-1 pgs=9 cs=7 l=0).fault
>>> >> with nothing to send, going to standby
>>> >> 2010-10-27 20:58:18.044950 b30fdb70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept
>>> >> peer addr is really 192.168.89.131:0/753148386 (socket is
>>> >> 192.168.89.131:45342/0)
>>> >> 2010-10-27 20:58:18.045247 b30fdb70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept
>>> >> connect_seq 7 vs existing 7 state 3
>>> >> 2010-10-27 20:58:18.053628 b30fdb70 failed to decode message of type
>>> >> 784 v2: buffer::end_of_buffer
>>> >> 2010-10-27 20:58:18.054824 b30fdb70 -- 192.168.89.133:6800/1824 >>
>>> >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=-1 pgs=10 cs=8 l=0).fault
>>> >> initiating reconnect
>>> >>
>>> >> So this is strange. It seems the PIDs known by Ceph are not the real
>>> >> PIDs of the processes. I don't know how this could happen.
>>> >>
>>> >> The version of the Ceph code is:
>>> >> mds0:~# ceph --version
>>> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >>
>>> >> Thank you very much!
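The off-by-one between the advertised PIDs (1824, 1622, 1616) and the
running ones (1825, 1623, 1617) suggests each daemon prints its messenger
address before forking into the background, so the number after the "/" is
a pre-fork PID. If that is right, the mismatch is cosmetic: the address
stays unique even though the PID embedded in it is stale. A quick
cross-check along those lines, as a sketch using the names from this
thread:

  # What is actually running (cmon/cmds/cosd) on a node
  ps ax | grep -E 'c(mon|mds|osd)' | grep -v grep

  # Which process really owns the mds port
  netstat -anp | grep ':6800 '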
>>> >>
>>> >> On Wed, Oct 27, 2010 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>> >> > Hi,
>>> >> >
>>> >> > On Wed, 27 Oct 2010, Yonggang Liu wrote:
>>> >> >> 2010-10-27 15:56:23.332473 b352eb70 failed to decode message of type
>>> >> >> 784 v4865: buffer::end_of_buffer
>>> >> >
>>> >> > This is the problem. Can you tell what 192.168.89.133:6800/1978 is?
>>> >> > (That 1978 is a pid, btw.) It's probably a 'cfuse' I'm guessing? (The
>>> >> > ceph daemons all start with 'c', but don't have 'ceph' in the name.)
>>> >> >
>>> >> > 784 is a client_caps message. The 'v4865' looks like an uninitialized
>>> >> > variable to me, as it should be a small integer.
>>> >> >
>>> >> > Whatever daemon that is, can you run it with the '-v' argument to see
>>> >> > exactly which version it is?
>>> >> >
>>> >> > Thanks!
>>> >> > sage
>>> >> >
>>> >> >> 2010-10-27 15:56:23.332730 b352eb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd0168 sd=-1 pgs=10 cs=8
>>> >> >> l=0).fault with nothing to send, going to standby
>>> >> >> 2010-10-27 15:56:52.022573 b332cb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is
>>> >> >> 192.168.89.131:41726/0)
>>> >> >> 2010-10-27 15:56:52.022945 b332cb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept connect_seq 8 vs existing 8 state 3
>>> >> >> 2010-10-27 15:56:52.025299 b332cb70 failed to decode message of type
>>> >> >> 784 v4865: buffer::end_of_buffer
>>> >> >> 2010-10-27 15:56:52.025569 b332cb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=-1 pgs=11 cs=9
>>> >> >> l=0).fault with nothing to send, going to standby
>>> >> >> 2010-10-27 15:57:48.106337 b362fb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is
>>> >> >> 192.168.89.131:41727/0)
>>> >> >> 2010-10-27 15:57:48.106522 b362fb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0
>>> >> >> l=0).accept connect_seq 9 vs existing 9 state 3
>>> >> >> 2010-10-27 15:57:48.109498 b362fb70 failed to decode message of type
>>> >> >> 784 v4865: buffer::end_of_buffer
>>> >> >> 2010-10-27 15:57:48.109761 b362fb70 -- 192.168.89.133:6800/1978 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=-1 pgs=12 cs=10
>>> >> >> l=0).fault with nothing to send, going to standby
>>> >> >>
>>> >> >> On osd0
>>> >> >> osd.0.log:
>>> >> >> 2010-10-27 15:55:01.470502 --- 1633 opened log /var/log/ceph/osd.0.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:55:01.485828 b72d38e0 filestore(/data/osd0) mkfs in /data/osd0
>>> >> >> 2010-10-27 15:55:01.486106 b72d38e0 filestore(/data/osd0) mkfs
>>> >> >> removing old file fsid
>>> >> >> 2010-10-27 15:55:01.516527 b72d38e0 filestore(/data/osd0) mkjournal
>>> >> >> created journal on /journal
>>> >> >> 2010-10-27 15:55:01.516734 b72d38e0 filestore(/data/osd0) mkfs done in
>>> >> >> /data/osd0
>>> >> >> 2010-10-27 15:55:01.519606 b72d38e0 filestore(/data/osd0) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:55:01.519794 b72d38e0 filestore(/data/osd0) mount found snaps <>
>>> >> >> 2010-10-27 15:55:01.548307 b5acfb70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:55:01.548455 b52ceb70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:55:01.548812 b72d38e0 journal close /journal
>>> >> >> 2010-10-27 15:55:25.201288 --- 1656 opened log /var/log/ceph/osd.0.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:55:25.223381 b74508e0 filestore(/data/osd0) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:55:25.224253 b74508e0 filestore(/data/osd0) mount found snaps <>
>>> >> >> 2010-10-27 15:55:25.225396 b74508e0 journal read_entry 4096 : seq 1 203 bytes
>>> >> >> 2010-10-27 15:55:26.841749 abffeb70 -- 0.0.0.0:6801/1656 >>
>>> >> >> 192.168.89.135:6801/1675 pipe(0x94ced18 sd=13 pgs=0 cs=0 l=0).connect
>>> >> >> claims to be 0.0.0.0:6801/1675 not 192.168.89.135:6801/1675 -
>>> >> >> presumably this is the same node!
>>> >> >> 2010-10-27 15:56:04.219613 abaf9b70 -- 192.168.89.134:6800/1656 >>
>>> >> >> 192.168.89.131:0/1344076202 pipe(0x96463e0 sd=16 pgs=0 cs=0
>>> >> >> l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is
>>> >> >> 192.168.89.131:36746/0)
>>> >> >>
>>> >> >> On osd1
>>> >> >> osd.1.log:
>>> >> >> 2010-10-27 15:54:59.752615 --- 1652 opened log /var/log/ceph/osd.1.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:54:59.766128 b73518e0 filestore(/data/osd1) mkfs in /data/osd1
>>> >> >> 2010-10-27 15:54:59.766658 b73518e0 filestore(/data/osd1) mkfs
>>> >> >> removing old file fsid
>>> >> >> 2010-10-27 15:54:59.796938 b73518e0 filestore(/data/osd1) mkjournal
>>> >> >> created journal on /journal
>>> >> >> 2010-10-27 15:54:59.797816 b73518e0 filestore(/data/osd1) mkfs done in
>>> >> >> /data/osd1
>>> >> >> 2010-10-27 15:54:59.800957 b73518e0 filestore(/data/osd1) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:54:59.801087 b73518e0 filestore(/data/osd1) mount found snaps <>
>>> >> >> 2010-10-27 15:54:59.832202 b534cb70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:54:59.832504 b5b4db70 FileStore::op_tp worker finish
>>> >> >> 2010-10-27 15:54:59.832723 b73518e0 journal close /journal
>>> >> >> 2010-10-27 15:55:23.050042 --- 1675 opened log /var/log/ceph/osd.1.log ---
>>> >> >> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>>> >> >> 2010-10-27 15:55:23.056671 b72e18e0 filestore(/data/osd1) mount did
>>> >> >> NOT detect btrfs
>>> >> >> 2010-10-27 15:55:23.056921 b72e18e0 filestore(/data/osd1) mount found snaps <>
>>> >> >> 2010-10-27 15:55:23.057368 b72e18e0 journal read_entry 4096 : seq 1 203 bytes
>>> >> >> 2010-10-27 15:55:23.207216 b12d4b70 osd1 2 map says i am down or have
>>> >> >> a different address. switching to boot state.
>>> >> >> 2010-10-27 15:55:23.207540 b12d4b70 log [WRN] : map e2 wrongly marked me down
>>> >> >>
>>> >> >> I noticed that the last two lines of osd.1.log on osd1 are unusual,
>>> >> >> but I'm not sure if they are the reason for the problem ...
>>> >> >>
>>> >> >> Thank you very much,
>>> >> >>
>>> >> >> On Wed, Oct 27, 2010 at 1:10 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>> >> >> > Hi Yonggang,
>>> >> >> >
>>> >> >> > Are all of the daemons still running? What is at the end of the logfiles?
>>> >> >> >
>>> >> >> > regards,
>>> >> >> > Colin
>>> >> >> >
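One way to answer both of Colin's questions across all three nodes at once,
as a sketch (the hostnames are the ones used in this thread, and it assumes
root ssh access between the VMs):

  for h in mds0 osd0 osd1; do
    echo "== $h =="
    # Are the daemons still running?
    ssh root@$h "ps ax | grep -E 'c(mon|mds|osd)' | grep -v grep"
    # What is at the end of the logfiles?
    ssh root@$h "tail -n 5 /var/log/ceph/*.log"
  done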
>>> >> >> > On Wed, Oct 27, 2010 at 9:42 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
>>> >> >> >> Hello,
>>> >> >> >>
>>> >> >> >> I'm totally new to Ceph. In the last few days I set up 4 VMs to run Ceph:
>>> >> >> >> "mds0" for the metadata server and monitor, "osd0" and "osd1" for two
>>> >> >> >> data servers, and "client" for the client machine. The VMs are running
>>> >> >> >> Debian 5.0 with kernel 2.6.32-5-686 (the Ceph module is enabled).
>>> >> >> >> I followed "Building kernel client" and "Debian" from the wiki, and I
>>> >> >> >> was able to start Ceph and mount it at the client. But the problem is,
>>> >> >> >> the mount point always fails with an infinite response time (within
>>> >> >> >> about a minute of mounting). To illustrate it better, I will show you
>>> >> >> >> the information I got on the client and mds0 machines:
>>> >> >> >>
>>> >> >> >> mds0 (192.168.89.133):
>>> >> >> >> debian:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -v
>>> >> >> >> (A lot of info)
>>> >> >> >> debian:~# /etc/init.d/ceph -a start
>>> >> >> >> (some info)
>>> >> >> >>
>>> >> >> >> client (192.168.89.131):
>>> >> >> >> debian:~# mount -t ceph 192.168.89.133:/ /ceph
>>> >> >> >> debian:~# cd /ceph
>>> >> >> >> debian:/ceph# cp ~/app_ch.xls .
>>> >> >> >> debian:/ceph# ls
>>> >> >> >> (waiting forever)
>>> >> >> >> ^C
>>> >> >> >>
>>> >> >> >> After the failure I ran dmesg at the client side and got:
>>> >> >> >> client (192.168.89.131):
>>> >> >> >> debian:/ceph# dmesg -c
>>> >> >> >> [ 636.664425] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
>>> >> >> >> [ 636.694973] ceph: client4100 fsid 423ad64c-bbf0-3011-bb47-36a89f8787c6
>>> >> >> >> [ 636.700716] ceph: mon0 192.168.89.133:6789 session established
>>> >> >> >> [ 664.114551] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 664.848722] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 665.914923] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 667.840396] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 672.054106] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 680.894531] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 696.928496] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 720.171754] ceph: mds0 caps stale
>>> >> >> >> [ 728.999701] ceph: mds0 192.168.89.133:6800 socket closed
>>> >> >> >> [ 794.640943] ceph: mds0 192.168.89.133:6800 socket closed
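That repeating "socket closed" pattern means the client keeps reconnecting
and the mds keeps dropping the session, which lines up with the decode
failures seen in mds.0.log later in the thread. One way to watch it happen
from the mds side while reproducing the hang, as a sketch:

  # On mds0: watch connection states on the mds port while the client retries
  watch -n 1 'netstat -an | grep 6800'

  # ... and follow the mds log for the matching fault lines
  tail -f /var/log/ceph/mds.0.log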
>>> >> >> >>
>>> >> >> >> Immediately after the failure, I ran netstat at mds0:
>>> >> >> >> debian:~# netstat -anp
>>> >> >> >> Active Internet connections (servers and established)
>>> >> >> >> Proto Recv-Q Send-Q Local Address          Foreign Address        State       PID/Program name
>>> >> >> >> tcp        0      0 0.0.0.0:6800           0.0.0.0:*              LISTEN      1889/cmds
>>> >> >> >> tcp        0      0 0.0.0.0:22             0.0.0.0:*              LISTEN      1529/sshd
>>> >> >> >> tcp        0      0 192.168.89.133:6789    0.0.0.0:*              LISTEN      1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.131:56855   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:43647   192.168.89.133:6789    ESTABLISHED 1889/cmds
>>> >> >> >> tcp        0      0 192.168.89.133:22      192.168.89.1:58304     ESTABLISHED 1530/0
>>> >> >> >> tcp        0      0 192.168.89.133:39826   192.168.89.134:6800    ESTABLISHED 1889/cmds
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.134:41289   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52814   TIME_WAIT   -
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.135:41021   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:42069   192.168.89.135:6800    ESTABLISHED 1889/cmds
>>> >> >> >> tcp        0      0 192.168.89.133:6789    192.168.89.133:43647   ESTABLISHED 1840/cmon
>>> >> >> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52815   TIME_WAIT   -
>>> >> >> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52816   TIME_WAIT   -
>>> >> >> >> tcp6       0      0 :::22                  :::*                   LISTEN      1529/sshd
>>> >> >> >> udp        0      0 0.0.0.0:68             0.0.0.0:*                          1490/dhclient3
>>> >> >> >> Active UNIX domain sockets (servers and established)
>>> >> >> >> Proto RefCnt Flags Type   State  I-Node PID/Program name Path
>>> >> >> >> unix  2      [ ]   DGRAM         2972   546/udevd        @/org/kernel/udev/udevd
>>> >> >> >> unix  4      [ ]   DGRAM         5343   1358/rsyslogd    /dev/log
>>> >> >> >> unix  2      [ ]   DGRAM         5662   1530/0
>>> >> >> >> unix  2      [ ]   DGRAM         5486   1490/dhclient3
>>> >> >> >> debian:~#
>>> >> >> >> debian:~# dmesg -c
>>> >> >> >> debian:~# (nothing shows up)
>>> >> >> >>
>>> >> >> >> I saw that port 6800 on the metadata server, the one talking to the
>>> >> >> >> client, is in the "TIME_WAIT" state. That means the connection was
>>> >> >> >> closed.
>>> >> >> >> This is the ceph.conf I have:
>>> >> >> >> [global]
>>> >> >> >> pid file = /var/run/ceph/$type.$id.pid
>>> >> >> >> [mon]
>>> >> >> >> mon data = /data/mon$id
>>> >> >> >> mon subscribe interval = 6000
>>> >> >> >> mon osd down out interval = 6000
>>> >> >> >> [mon0]
>>> >> >> >> host = mds0
>>> >> >> >> mon addr = 192.168.89.133:6789
>>> >> >> >> [mds]
>>> >> >> >> mds session timeout = 6000
>>> >> >> >> mds session autoclose = 6000
>>> >> >> >> mds client lease = 6000
>>> >> >> >> keyring = /data/keyring.$name
>>> >> >> >> [mds0]
>>> >> >> >> host = mds0
>>> >> >> >> [osd]
>>> >> >> >> sudo = true
>>> >> >> >> osd data = /data/osd$id
>>> >> >> >> osd journal = /journal
>>> >> >> >> osd journal size = 1024
>>> >> >> >> filestore journal writeahead = true
>>> >> >> >> [osd0]
>>> >> >> >> host = osd0
>>> >> >> >> [osd1]
>>> >> >> >> host = osd1
>>> >> >> >> [group everyone]
>>> >> >> >> addr = 0.0.0.0/0
>>> >> >> >> [mount]
>>> >> >> >> allow = %everyone
>>> >> >> >> ;-----------------------------------end-----------------------------------
>>> >> >> >>
>>> >> >> >> The Ceph version I was using is 0.22.1.
>>> >> >> >>
>>> >> >> >> Can anyone help me to solve this problem? Thanks in advance!
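One detail in this ceph.conf ties back to the PID confusion discussed
further up the thread: with "pid file = /var/run/ceph/$type.$id.pid", each
daemon should record its PID after daemonizing, so the pid files ought to
be more trustworthy than the numbers echoed at startup. A sketch of a check
along those lines (the paths simply follow the conf above):

  # Compare recorded PIDs with what is actually running
  for f in /var/run/ceph/*.pid; do
    echo "$f -> $(cat "$f")"
  done
  ps ax | grep -E 'c(mon|mds|osd)' | grep -v grep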
>>> >> >> >>
>>> >> >> >
>>> >> >>
>>> >> >> --
>>> >> >> Yonggang Liu
>>> >> >> Advanced Computing and Information Systems Laboratory
>>> >> >> University of Florida
>>> >>
>>> >>
>>> >> --
>>> >> Yonggang Liu
>>> >> Advanced Computing and Information Systems Laboratory
>>> >> University of Florida
>>>
>>>
>>> --
>>> Yonggang Liu
>>> Advanced Computing and Information Systems Laboratory
>>> University of Florida
>
>
> --
> Yonggang Liu
> Advanced Computing and Information Systems Laboratory
> University of Florida
>

--
Yonggang Liu
Advanced Computing and Information Systems Laboratory
University of Florida
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html