---------- Forwarded message ----------
From: Yonggang Liu <myidpt@xxxxxxxxx>
Date: Wed, Oct 27, 2010 at 7:32 PM
Subject: Re: mds port 6800 socket closed when accessing mount point
To: Sage Weil <sage@xxxxxxxxxxxx>

Hi Sage,

Thank you very much for your reply. To answer your question: PID 1978 is not a
running process; it is the PID Ceph reported for the mds0 daemon, but the real
PID of mds0 was not that one (it seemed to be 1979).

I repeated the procedure and gathered the results. I think I can see part of
the problem now: when Ceph starts, the init script prints the PIDs of the
daemons it starts on each machine, but these are not the exact PIDs of the
daemons that are actually running.

I started the system and got this PID information:

mds0:~# /etc/init.d/ceph -a start
=== mon.0 ===
Starting Ceph mon0 on mds0...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting mon.0 rank 0 at 192.168.89.133:6789/0 mon_data /data/mon0 fsid 9ff93f53-5e63-011c-4437-8e3f780cfcde
=== mds.0 ===
Starting Ceph mds0 on mds0...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting mds.0 at 0.0.0.0:6800/1824
=== osd.0 ===
Starting Ceph osd0 on osd0...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting osd0 at 0.0.0.0:6800/1622 osd_data /data/osd0 /data/osd0/journal
=== osd.1 ===
Starting Ceph osd1 on osd1...
 ** WARNING: Ceph is still under heavy development, and is only suitable for **
 ** testing and review.  Do not trust it with important data.               **
starting osd1 at 0.0.0.0:6800/1616 osd_data /data/osd1 /data/osd1/journal

So according to this output, the PIDs known by Ceph are:

mon.0  0
mds.0  1824
osd.0  1622
osd.1  1616

But when I checked the PIDs with "ps aux | grep ceph" on the three machines, I
got this:

On mds0:
root 1792  1.0 1.2 46412  3260 ? S<sl 20:14 0:07 /usr/bin/cmon -i 0 -c /etc/ceph/ceph.conf
root 1825 28.3 1.2 72228  3076 ? S<sl 20:14 3:06 /usr/bin/cmds -i 0 -c /etc/ceph/ceph.conf

On osd0:
root 1623  0.9 8.3 214820 21356 ? S<sl 20:07 0:01 /usr/bin/cosd -i 0 -c /tmp/ceph.conf.1747

On osd1:
root 1617  0.7 7.5 213780 19264 ? S<sl 20:07 0:01 /usr/bin/cosd -i 1 -c /tmp/ceph.conf.1747

(Note: the second field is the PID.)

Running "netstat -anp | grep 6800" on mds0, I got:

mds0:~# netstat -anp | grep 6800
tcp  0  0 0.0.0.0:6800          0.0.0.0:*              LISTEN       1825/cmds
tcp  0  0 192.168.89.133:33216  192.168.89.135:6800    ESTABLISHED  1825/cmds
tcp  0  0 192.168.89.133:56908  192.168.89.134:6800    ESTABLISHED  1825/cmds
tcp  0  0 192.168.89.133:6800   192.168.89.131:45334   ESTABLISHED  1825/cmds

(I can see that PID 1825 is handling the client-to-mds0 communication.)

At the end of "mds.0.log", I saw the PID 1824:

2010-10-27 20:57:45.537898 b35e4b70 failed to decode message of type 784 v2: buffer::end_of_buffer
2010-10-27 20:57:45.538652 b35e4b70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f43488 sd=-1 pgs=9 cs=7 l=0).fault with nothing to send, going to standby
2010-10-27 20:58:18.044950 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/753148386 (socket is 192.168.89.131:45342/0)
2010-10-27 20:58:18.045247 b30fdb70 -- 192.168.89.133:6800/1824 >> 192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=10 pgs=0 cs=0 l=0).accept connect_seq 7 vs existing 7 state 3
2010-10-27 20:58:18.053628 b30fdb70 failed to decode message of type 784 v2: buffer::end_of_buffer
2010-10-27 20:58:18.054824 b30fdb70 -- 192.168.89.133:6800/1824 >>
192.168.89.131:0/753148386 pipe(0x9f7cca8 sd=-1 pgs=10 cs=8 l=0).fault initiating reconnect

So this is strange: the PIDs known by Ceph are not the real PIDs of the
processes. I don't know how this can happen.

The Ceph version is:

mds0:~# ceph --version
ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)

Thank you very much!

On Wed, Oct 27, 2010 at 4:47 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Hi,
>
> On Wed, 27 Oct 2010, Yonggang Liu wrote:
>> 2010-10-27 15:56:23.332473 b352eb70 failed to decode message of type
>> 784 v4865: buffer::end_of_buffer
>
> This is the problem.  Can you tell what 192.168.89.133:6800/1978 is?
> (That 1978 is a pid, btw.)  It's probably a 'cfuse' I'm guessing?  (The
> ceph daemons all start with 'c', but don't have 'ceph' in the name.)
>
> 784 is a client_caps message.  The 'v4865' looks like an uninitialized
> variable to me, as it should be a small integer.
>
> Whatever daemon that is, can you run it with the '-v' argument to see
> exactly which version it is?
>
> Thanks!
> sage
>
>> 2010-10-27 15:56:23.332730 b352eb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd0168 sd=-1 pgs=10 cs=8 l=0).fault with nothing to send, going to standby
>> 2010-10-27 15:56:52.022573 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:41726/0)
>> 2010-10-27 15:56:52.022945 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=12 pgs=0 cs=0 l=0).accept connect_seq 8 vs existing 8 state 3
>> 2010-10-27 15:56:52.025299 b332cb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
>> 2010-10-27 15:56:52.025569 b332cb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd1720 sd=-1 pgs=11 cs=9 l=0).fault with nothing to send, going to standby
>> 2010-10-27 15:57:48.106337 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:41727/0)
>> 2010-10-27 15:57:48.106522 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=12 pgs=0 cs=0 l=0).accept connect_seq 9 vs existing 9 state 3
>> 2010-10-27 15:57:48.109498 b362fb70 failed to decode message of type 784 v4865: buffer::end_of_buffer
>> 2010-10-27 15:57:48.109761 b362fb70 -- 192.168.89.133:6800/1978 >> 192.168.89.131:0/1344076202 pipe(0x9fd07c0 sd=-1 pgs=12 cs=10 l=0).fault with nothing to send, going to standby
>>
>> On osd0
>> osd.0.log:
>> 2010-10-27 15:55:01.470502 --- 1633 opened log /var/log/ceph/osd.0.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:55:01.485828 b72d38e0 filestore(/data/osd0) mkfs in /data/osd0
>> 2010-10-27 15:55:01.486106 b72d38e0 filestore(/data/osd0) mkfs removing old file fsid
>> 2010-10-27 15:55:01.516527 b72d38e0 filestore(/data/osd0) mkjournal created journal on /journal
>> 2010-10-27 15:55:01.516734 b72d38e0 filestore(/data/osd0) mkfs done in /data/osd0
>> 2010-10-27 15:55:01.519606 b72d38e0 filestore(/data/osd0) mount did NOT detect btrfs
>> 2010-10-27 15:55:01.519794 b72d38e0 filestore(/data/osd0) mount found snaps <>
>> 2010-10-27 15:55:01.548307 b5acfb70 FileStore::op_tp worker finish
>> 2010-10-27 15:55:01.548455 b52ceb70 FileStore::op_tp worker finish
>> 2010-10-27 15:55:01.548812 b72d38e0 journal close /journal
>> 2010-10-27 15:55:25.201288 --- 1656 opened log /var/log/ceph/osd.0.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:55:25.223381 b74508e0 filestore(/data/osd0) mount did NOT detect btrfs
>> 2010-10-27 15:55:25.224253 b74508e0 filestore(/data/osd0) mount found snaps <>
>> 2010-10-27 15:55:25.225396 b74508e0 journal read_entry 4096 : seq 1 203 bytes
>> 2010-10-27 15:55:26.841749 abffeb70 -- 0.0.0.0:6801/1656 >> 192.168.89.135:6801/1675 pipe(0x94ced18 sd=13 pgs=0 cs=0 l=0).connect claims to be 0.0.0.0:6801/1675 not 192.168.89.135:6801/1675 - presumably this is the same node!
>> 2010-10-27 15:56:04.219613 abaf9b70 -- 192.168.89.134:6800/1656 >> 192.168.89.131:0/1344076202 pipe(0x96463e0 sd=16 pgs=0 cs=0 l=0).accept peer addr is really 192.168.89.131:0/1344076202 (socket is 192.168.89.131:36746/0)
>>
>> On osd1
>> osd.1.log:
>> 2010-10-27 15:54:59.752615 --- 1652 opened log /var/log/ceph/osd.1.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:54:59.766128 b73518e0 filestore(/data/osd1) mkfs in /data/osd1
>> 2010-10-27 15:54:59.766658 b73518e0 filestore(/data/osd1) mkfs removing old file fsid
>> 2010-10-27 15:54:59.796938 b73518e0 filestore(/data/osd1) mkjournal created journal on /journal
>> 2010-10-27 15:54:59.797816 b73518e0 filestore(/data/osd1) mkfs done in /data/osd1
>> 2010-10-27 15:54:59.800957 b73518e0 filestore(/data/osd1) mount did NOT detect btrfs
>> 2010-10-27 15:54:59.801087 b73518e0 filestore(/data/osd1) mount found snaps <>
>> 2010-10-27 15:54:59.832202 b534cb70 FileStore::op_tp worker finish
>> 2010-10-27 15:54:59.832504 b5b4db70 FileStore::op_tp worker finish
>> 2010-10-27 15:54:59.832723 b73518e0 journal close /journal
>> 2010-10-27 15:55:23.050042 --- 1675 opened log /var/log/ceph/osd.1.log ---
>> ceph version 0.22.1 (commit:7464f9688001aa89f9673ba14e6d075d0ee33541)
>> 2010-10-27 15:55:23.056671 b72e18e0 filestore(/data/osd1) mount did NOT detect btrfs
>> 2010-10-27 15:55:23.056921 b72e18e0 filestore(/data/osd1) mount found snaps <>
>> 2010-10-27 15:55:23.057368 b72e18e0 journal read_entry 4096 : seq 1 203 bytes
>> 2010-10-27 15:55:23.207216 b12d4b70 osd1 2 map says i am down or have a different address. switching to boot state.
>> 2010-10-27 15:55:23.207540 b12d4b70 log [WRN] : map e2 wrongly marked me down
>>
>> I noticed that the last two lines of osd.1.log on osd1 are unusual, but
>> I'm not sure whether they are the cause of the problem ...
>>
>> Thank you very much,
>>
>>
>> On Wed, Oct 27, 2010 at 1:10 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>> > Hi Yonggang,
>> >
>> > Are all of the daemons still running? What is at the end of the logfiles?
>> >
>> > regards,
>> > Colin
>> >
>> >
>> > On Wed, Oct 27, 2010 at 9:42 AM, Yonggang Liu <myidpt@xxxxxxxxx> wrote:
>> >> Hello,
>> >>
>> >> I'm totally new to Ceph. Over the last few days I set up 4 VMs to run Ceph:
>> >> "mds0" for the metadata server and monitor, "osd0" and "osd1" for two
>> >> data servers, and "client" for the client machine. The VMs are running
>> >> Debian 5.0 with kernel 2.6.32-5-686 (the Ceph module is enabled).
>> >> I followed "Building kernel client" and "Debian" from the wiki, and I
>> >> was able to start Ceph and mount it at the client. But the problem is
>> >> that the mount point always hangs with an infinite response time
>> >> (within about 1 min of mounting Ceph, or less). To illustrate it
>> >> better, I will show you the information I got on the client and mds0
>> >> machines:
>> >>
>> >> mds0 (192.168.89.133):
>> >> debian:~# mkcephfs -c /etc/ceph/ceph.conf --allhosts -v
>> >> (a lot of info)
>> >> debian:~# /etc/init.d/ceph -a start
>> >> (some info)
>> >>
>> >> client (192.168.89.131):
>> >> debian:~# mount -t ceph 192.168.89.133:/ /ceph
>> >> debian:~# cd /ceph
>> >> debian:/ceph# cp ~/app_ch.xls .
>> >> debian:/ceph# ls
>> >> (waiting forever)
>> >> ^C
>> >>
>> >> After the failure I ran dmesg on the client side and got:
>> >> client (192.168.89.131):
>> >> debian:/ceph# dmesg -c
>> >> [ 636.664425] ceph: loaded (mon/mds/osd proto 15/32/24, osdmap 5/5 5/5)
>> >> [ 636.694973] ceph: client4100 fsid 423ad64c-bbf0-3011-bb47-36a89f8787c6
>> >> [ 636.700716] ceph: mon0 192.168.89.133:6789 session established
>> >> [ 664.114551] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 664.848722] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 665.914923] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 667.840396] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 672.054106] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 680.894531] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 696.928496] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 720.171754] ceph: mds0 caps stale
>> >> [ 728.999701] ceph: mds0 192.168.89.133:6800 socket closed
>> >> [ 794.640943] ceph: mds0 192.168.89.133:6800 socket closed
>> >>
>> >> Immediately after the failure, I ran netstat on mds0:
>> >> mds0 (192.168.89.133):
>> >> debian:~# netstat -anp
>> >> Active Internet connections (servers and established)
>> >> Proto Recv-Q Send-Q Local Address          Foreign Address        State        PID/Program name
>> >> tcp        0      0 0.0.0.0:6800           0.0.0.0:*              LISTEN       1889/cmds
>> >> tcp        0      0 0.0.0.0:22             0.0.0.0:*              LISTEN       1529/sshd
>> >> tcp        0      0 192.168.89.133:6789    0.0.0.0:*              LISTEN       1840/cmon
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.131:56855   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:43647   192.168.89.133:6789    ESTABLISHED  1889/cmds
>> >> tcp        0      0 192.168.89.133:22      192.168.89.1:58304     ESTABLISHED  1530/0
>> >> tcp        0      0 192.168.89.133:39826   192.168.89.134:6800    ESTABLISHED  1889/cmds
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.134:41289   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52814   TIME_WAIT    -
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.135:41021   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:42069   192.168.89.135:6800    ESTABLISHED  1889/cmds
>> >> tcp        0      0 192.168.89.133:6789    192.168.89.133:43647   ESTABLISHED  1840/cmon
>> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52815   TIME_WAIT    -
>> >> tcp        0      0 192.168.89.133:6800    192.168.89.131:52816   TIME_WAIT    -
>> >> tcp6       0      0 :::22                  :::*                   LISTEN       1529/sshd
>> >> udp        0      0 0.0.0.0:68             0.0.0.0:*                           1490/dhclient3
>> >> Active UNIX domain sockets (servers and established)
>> >> Proto RefCnt Flags  Type   State  I-Node  PID/Program name  Path
>> >> unix  2      [ ]    DGRAM         2972    546/udevd         @/org/kernel/udev/udevd
>> >> unix  4      [ ]    DGRAM         5343    1358/rsyslogd     /dev/log
>> >> unix  2      [ ]    DGRAM         5662    1530/0
>> >> unix  2      [ ]    DGRAM         5486    1490/dhclient3
>> >> debian:~#
>> >> debian:~# dmesg -c
>> >> debian:~# (nothing shows up)
>> >>
>> >> I saw that the port-6800 connections between the metadata server and
>> >> the client are in the "TIME_WAIT" state, which means those
>> >> connections were closed.
>> >> This is the ceph.conf I have:
>> >> [global]
>> >>         pid file = /var/run/ceph/$type.$id.pid
>> >> [mon]
>> >>         mon data = /data/mon$id
>> >>         mon subscribe interval = 6000
>> >>         mon osd down out interval = 6000
>> >> [mon0]
>> >>         host = mds0
>> >>         mon addr = 192.168.89.133:6789
>> >> [mds]
>> >>         mds session timeout = 6000
>> >>         mds session autoclose = 6000
>> >>         mds client lease = 6000
>> >>         keyring = /data/keyring.$name
>> >> [mds0]
>> >>         host = mds0
>> >> [osd]
>> >>         sudo = true
>> >>         osd data = /data/osd$id
>> >>         osd journal = /journal
>> >>         osd journal size = 1024
>> >>         filestore journal writeahead = true
>> >> [osd0]
>> >>         host = osd0
>> >> [osd1]
>> >>         host = osd1
>> >> [group everyone]
>> >>         addr = 0.0.0.0/0
>> >> [mount]
>> >>         allow = %everyone
>> >> ;-----------------------------------end-----------------------------------
>> >>
>> >> The Ceph version I was using is 0.22.1.
>> >>
>> >> Can anyone help me to solve this problem?
>> >> Thanks in advance!
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> >
>>
>> --
>> Yonggang Liu
>> Advanced Computing and Information Systems Laboratory
>> University of Florida

--
Yonggang Liu
Advanced Computing and Information Systems Laboratory
University of Florida
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
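[Editor's sketch of the PID cross-check discussed in the thread. The address
format "ip:port/number" is taken from the log excerpts above (e.g.
"192.168.89.133:6800/1824"); extracting the trailing number and comparing it
against a live PID is my own illustration, and the pgrep pattern for the cmds
binary is an assumption based on the process list shown in the thread, not a
documented procedure.]

```shell
#!/bin/sh
# Extract the number that Ceph prints after the '/' in a daemon address
# (as seen in mds.0.log above) so it can be compared with a real PID.
addr="192.168.89.133:6800/1824"   # address copied from the log excerpt
num="${addr##*/}"                 # strip everything up to the last '/'
echo "number in address: $num"

# On the mds host one would then compare it against the running daemon,
# e.g. (assumed pgrep pattern, matching the ps output shown above):
#   pid=$(pgrep -f '/usr/bin/cmds' | head -n1)
#   [ "$num" = "$pid" ] || echo "mismatch: address says $num, ps says $pid"
```

In the thread's data this comparison shows 1824 in the address versus 1825 in
ps, i.e. an off-by-one on every daemon.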