On Mon, Feb 28, 2011 at 4:00 AM, Upendra Moturi <upendra.m@xxxxxxxxxxxx> wrote:
> Hi Colin
>
> /var/run/ceph exists on only one node (the node on which I start the
> cluster with -a)

Without the pid file under /var/run/ceph, the init script will not know
which pid to kill, so it will do nothing.

Try creating /var/run/ceph and the other appropriate directories on all
nodes before starting the cluster.

Colin

>
> On Sat, Feb 26, 2011 at 5:58 AM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>> Hi Upendra,
>>
>> Based on the output you posted, init-ceph is doing something on every
>> node. However, I only see a kill for certain nodes.
>>
>> Does /var/run/ceph/ exist on all nodes, or just some of them? Does the
>> appropriate pid file exist on all nodes? What happens when you ssh in
>> to those nodes manually and run init-ceph stop?
>>
>> Colin
>>
>>
>> On Fri, Feb 18, 2011 at 6:11 AM, Upendra Moturi <upendra.m@xxxxxxxxxxxx> wrote:
>>> Hi Colin
>>>
>>> I am using Ubuntu 11.04 (32 bit) and got the ceph package from apt-get.
>>> I am using the default init script (found at /etc/init.d/ceph).
>>>
>>> Regarding issue 1)
>>>
>>> On the osd node I tried to start that osd but it did not work.
>>> Steps followed:
>>>
>>> 1) Started 3 nodes (ceph.conf is the same as I sent earlier)
>>>
>>> 2) mkcephfs -c /etc/ceph/ceph.conf -a --mkbtrfs -k /etc/ceph/keyring.bin
>>>
>>> 3) /etc/init.d/ceph start osd0 (did this on the first osd)
>>>
>>> 4) ps -ef | grep ceph or ps -ef | grep cosd ---- does not show any process
>>>
>>> 5) /etc/init.d/ceph -a start ---- shows all processes on all nodes
>>>
>>> Regarding issue 2
>>> It says the -x option is not available.
>>>
>>> Tried /etc/init.d/ceph -ax stop and /etc/init.d/ceph -x stop and even
>>> tried /etc/init.d/ceph -x -a stop
>>> but nothing worked.
>>>
>>> Then tried with /etc/init.d/ceph -a -v stop this also did not stop
>>> ceph on all nodes but got the output as
>>>
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "ssh path" "/etc/ceph"
>>> === mon.0 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "pid file"
>>> "/var/run/ceph/mon.0.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "log sym dir" ""
>>> --- ssh ceph0 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mon "post stop command" ""
>>> Stopping Ceph mon.0 on ceph0...--- ssh ceph0 "cd /etc/ceph ; ulimit
>>> -c unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/mon.0.pid ] || break
>>> pid=`cat /var/run/ceph/mon.0.pid`
>>> while [ -e /proc/$pid ] && grep -q cmon /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> kill 1668...done
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "ssh path" "/etc/ceph"
>>> === mon.1 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "pid file"
>>> "/var/run/ceph/mon.1.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "log sym dir" ""
>>> --- ssh ceph1 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mon "post stop command" ""
>>> Stopping Ceph mon.1 on ceph1...--- ssh ceph1 "cd /etc/ceph ; ulimit
>>> -c unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/mon.1.pid ] || break
>>> pid=`cat /var/run/ceph/mon.1.pid`
>>> while [ -e /proc/$pid ] && grep -q cmon /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> done
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "ssh path" "/etc/ceph"
>>> === mon.2 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "pid file"
>>> "/var/run/ceph/mon.2.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "log sym dir" ""
>>> --- ssh ceph2 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t mon "post stop command" ""
>>> Stopping Ceph mon.2 on ceph2...--- ssh ceph2 "cd /etc/ceph ; ulimit
>>> -c unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/mon.2.pid ] || break
>>> pid=`cat /var/run/ceph/mon.2.pid`
>>> while [ -e /proc/$pid ] && grep -q cmon /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> done
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "ssh path" "/etc/ceph"
>>> === mds.0 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "pid file"
>>> "/var/run/ceph/mds.0.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "log sym dir" ""
>>> --- ssh ceph0 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t mds "post stop command" ""
>>> Stopping Ceph mds0 on ceph0...--- ssh ceph0 "cd /etc/ceph ; ulimit -c
>>> unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/mds.0.pid ] || break
>>> pid=`cat /var/run/ceph/mds.0.pid`
>>> while [ -e /proc/$pid ] && grep -q cmds /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> kill 1844...done
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "ssh path" "/etc/ceph"
>>> === mds.1 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "pid file"
>>> "/var/run/ceph/mds.1.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "log sym dir" ""
>>> --- ssh ceph1 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t mds "post stop command" ""
>>> Stopping Ceph mds1 on ceph1...--- ssh ceph1 "cd /etc/ceph ; ulimit -c
>>> unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/mds.1.pid ] || break
>>> pid=`cat /var/run/ceph/mds.1.pid`
>>> while [ -e /proc/$pid ] && grep -q cmds /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> done
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "ssh path" "/etc/ceph"
>>> === osd.0 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "pid file"
>>> "/var/run/ceph/osd.0.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "log sym dir" ""
>>> --- ssh ceph0 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "osd data" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "btrfs path" "/data/osd0"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "btrfs devs" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 0 -t osd "post stop command" ""
>>> Stopping Ceph osd0 on ceph0...--- ssh ceph0 "cd /etc/ceph ; ulimit -c
>>> unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/osd.0.pid ] || break
>>> pid=`cat /var/run/ceph/osd.0.pid`
>>> while [ -e /proc/$pid ] && grep -q cosd /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> kill 2033...done
>>> Unmounting Btrfs on ceph0:/data/osd0
>>> --- ssh root@ceph0 "cd /etc/ceph ; ulimit -c unlimited ; umount
>>> /data/osd0 || true"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "ssh path" "/etc/ceph"
>>> === osd.1 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "pid file"
>>> "/var/run/ceph/osd.1.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "log sym dir" ""
>>> --- ssh ceph1 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "osd data" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "btrfs path" "/data/osd1"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "btrfs devs" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 1 -t osd "post stop command" ""
>>> Stopping Ceph osd1 on ceph1...--- ssh ceph1 "cd /etc/ceph ; ulimit -c
>>> unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/osd.1.pid ] || break
>>> pid=`cat /var/run/ceph/osd.1.pid`
>>> while [ -e /proc/$pid ] && grep -q cosd /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> done
>>> Unmounting Btrfs on ceph1:/data/osd1
>>> --- ssh root@ceph1 "cd /etc/ceph ; ulimit -c unlimited ; umount
>>> /data/osd1 || true"
>>> umount: /data/osd1: device is busy.
>>> (In some cases useful info about processes that use
>>> the device is found by lsof(8) or fuser(1))
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "auto start" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "user" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "ssh path" "/etc/ceph"
>>> === osd.2 ===
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "pid file"
>>> "/var/run/ceph/osd.2.pid"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "log dir" "/var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "log sym dir" ""
>>> --- ssh ceph2 "cd /etc/ceph ; ulimit -c unlimited ; mkdir -p /var/log/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "osd data" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "btrfs path" "/data/osd2"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "btrfs devs" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "lock file"
>>> "/var/lock/subsys/ceph"
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "pre stop command" ""
>>> /usr/bin/cconf -c /etc/ceph/ceph.conf -i 2 -t osd "post stop command" ""
>>> Stopping Ceph osd2 on ceph2...--- ssh ceph2 "cd /etc/ceph ; ulimit -c
>>> unlimited ; while [ 1 ]; do
>>> [ -e /var/run/ceph/osd.2.pid ] || break
>>> pid=`cat /var/run/ceph/osd.2.pid`
>>> while [ -e /proc/$pid ] && grep -q cosd /proc/$pid/cmdline ; do
>>> cmd="kill $pid"
>>> echo -n $cmd...
>>> $cmd
>>> sleep 1
>>> continue
>>> done
>>> break
>>> done"
>>> done
>>> Unmounting Btrfs on ceph2:/data/osd2
>>> --- ssh root@ceph2 "cd /etc/ceph ; ulimit -c unlimited ; umount
>>> /data/osd2 || true"
>>> umount: /data/osd2: device is busy.
>>> (In some cases useful info about processes that use
>>> the device is found by lsof(8) or fuser(1))
>>>
>>>
>>> On Fri, Feb 18, 2011 at 12:35 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>>> Hi Upendra,
>>>>
>>>> Are you running init-ceph from the source directory?
>>>> If you do that,
>>>> it will use the ceph.conf in the source directory itself, which is
>>>> probably not what you want. So it might be good to double-check that.
>>>>
>>>> If all else fails, running init-ceph with -x will show you exactly
>>>> what the script is doing. If all goes well, its exit status should be
>>>> 0. Are you getting exit status 0?
>>>>
>>>> Colin
>>>>
>>>>
>>>> On Thu, Feb 17, 2011 at 1:55 AM, Upendra Moturi <upendra.m@xxxxxxxxxxxx> wrote:
>>>>> Hi Colin
>>>>>
>>>>> Here is my ceph.conf:
>>>>>
>>>>> [global]
>>>>>     pid file = /var/run/ceph/$name.pid
>>>>>     debug ms = 1
>>>>> [mon]
>>>>>     mon data = /data/mon$id
>>>>> [mon.0]
>>>>>     host = ceph0
>>>>>     mon addr = 192.168.155.5:6789
>>>>> [mon.1]
>>>>>     host = ceph1
>>>>>     mon addr = 192.168.155.6:6789
>>>>> [mon.2]
>>>>>     host = ceph2
>>>>>     mon addr = 192.168.155.7:6789
>>>>> [mds]
>>>>>
>>>>> [mds0]
>>>>>     host = ceph0
>>>>> [mds1]
>>>>>     host = ceph1
>>>>>
>>>>> [osd]
>>>>>     sudo = true
>>>>>     osd data = /data/osd$id
>>>>>     osd journal = /data/osd$id/journal
>>>>>     osd journal size = 512
>>>>>     osd use stale snap = true
>>>>> [osd0]
>>>>>     host = ceph0
>>>>>     btrfs devs = /dev/sdb
>>>>> [osd1]
>>>>>     host = ceph1
>>>>>     btrfs devs = /dev/sdb
>>>>> [osd2]
>>>>>     host = ceph2
>>>>>     btrfs devs = /dev/sdb
>>>>>
>>>>>
>>>>> On Thu, Feb 17, 2011 at 1:06 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>>>>> I'm using head of line from the master branch. But that particular
>>>>>> code hasn't changed since January, which is when 0.24.2 is from.
>>>>>>
>>>>>> In my ceph.conf, I just had an osd that was on a remote machine, and
>>>>>> everything else local.
>>>>>>
>>>>>> If you could post your ceph.conf here or in IRC, perhaps we might spot
>>>>>> an issue that's causing the problems that you see.
>>>>>>
>>>>>> Colin
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 16, 2011 at 11:01 PM, Upendra Moturi <upendra.m@xxxxxxxxxxxx> wrote:
>>>>>>> Hi Colin
>>>>>>> I am using
>>>>>>> ceph version 0.24.2 commit:f7572de5cb87eb7157217be4975ae66d90831bb7
>>>>>>> Ubuntu 11.04 32 bit with an upgraded kernel, 2.6.38-2-generic
>>>>>>>
>>>>>>> Installed ceph from the apt source.
>>>>>>>
>>>>>>> With the above configuration I am still able to reproduce it.
>>>>>>> Can you please share your configuration?
>>>>>>>
>>>>>>> On Thu, Feb 17, 2011 at 3:31 AM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>>>>>>> On Wed, Feb 16, 2011 at 1:41 PM, Colin McCabe <cmccabe@xxxxxxxxxxxxxx> wrote:
>>>>>>>>> On Wed, Feb 16, 2011 at 6:44 AM, Upendra Moturi <upendra.m@xxxxxxxxxxxx> wrote:
>>>>>>>>>> But if we want to start a particular osd or mon or mds, it's not
>>>>>>>>>> working and there is no error.
>>>>>>>>>> e.g. /etc/init.d/ceph start osd1 does not start osd1 and I don't get any error
>>>>>>>>>
>>>>>>>>> That is expected, unless you are running init-ceph on the same node as
>>>>>>>>> osd1 is on.
>>>>>>>>>
>>>>>>>>> It might be nice to have some kind of interface like "run command X on
>>>>>>>>> osd1", but init-ceph is not that.
>>>>>>>>>
>>>>>>>>>> /etc/init.d/ceph -a stop also does not stop ceph on all nodes. It stops
>>>>>>>>>> it on the current node only,
>>>>>>>>>> whereas
>>>>>>>>>> /etc/init.d/ceph -a killall works fine.
>>>>>>>>>
>>>>>>>>> That sounds like a bug. I'll see if I can fix it.
>>>>>>>>
>>>>>>>> I'm afraid I can't reproduce this.
>>>>>>>>
>>>>>>>> I ran /etc/init.d/ceph -a stop
>>>>>>>>
>>>>>>>> and it stopped ceph daemons running on remote nodes too. Looking at
>>>>>>>> the code, it looks correct.
>>>>>>>>
>>>>>>>> Colin
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Thanks and Regards,
>>>>>>> Upendra.M
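
The suggestion at the top of this thread (create /var/run/ceph on every
node before starting the cluster) could be scripted roughly as follows.
This is only a sketch: the host names ceph0, ceph1 and ceph2 and the
directory come from the thread, while the helper name ensure_pid_dir and
the PID_DIR and REMOTE variables are made up for illustration and are not
part of init-ceph.

```shell
#!/bin/sh
# Sketch only: make sure the pid-file directory exists on every node.
# ensure_pid_dir, PID_DIR and REMOTE are hypothetical names, not init-ceph
# options.

# Directory that "pid file = /var/run/ceph/$name.pid" points into.
PID_DIR="${PID_DIR:-/var/run/ceph}"
# Command used to reach a node; defaults to ssh.
REMOTE="${REMOTE:-ssh}"

# ensure_pid_dir HOST...
# Runs "mkdir -p $PID_DIR" on each host and returns non-zero if any host
# failed, so the caller can refuse to start the cluster.
ensure_pid_dir() {
    rc=0
    for host in "$@"; do
        if $REMOTE "$host" "mkdir -p $PID_DIR"; then
            echo "ok: $host"
        else
            echo "FAILED: $host" >&2
            rc=1
        fi
    done
    return $rc
}
```

Assuming passwordless ssh with enough privileges to write under /var/run,
it might be used as:

    ensure_pid_dir ceph0 ceph1 ceph2 && /etc/init.d/ceph -a start
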