On Wed, 22 Sep 2010, cang lin wrote:
> We mount ceph not only on a client in the same subnet but also on a
> remote client over the internet. In the first week everything worked
> fine; the load is about 100 GB of writes and around 10 reads per day.
> The files are almost read-only and range from dozens of MB to a few GB,
> so it is not a very heavy load. But in the second week the client in the
> same subnet as the ceph cluster could no longer access ceph and could
> not unmount it; the remote client could still access and unmount ceph.
>
> Using 'ceph -s' and 'ceph osd dump -0' on ceph01 we found that 3 of the
> 4 osds were down (osd0, osd2, osd4). Using 'df -h' we found that
> /dev/sde1 (for osd0), /dev/sdd1 (for osd2) and /dev/sdc1 (for osd4) were
> still at their mount points.
>
> We used the following command to restart the osds:
>
> # /etc/init.d/ceph start osd0
> [/etc/ceph/fetch_config /tmp/fetched.ceph.conf.4967]
> === osd.0 ===
> Starting Ceph osd0 on ceph01...
> ** WARNING: Ceph is still under heavy development, and is only suitable for **
> ** testing and review. Do not trust it with important data. **
> starting osd0 at 0.0.0.0:6800/4864 osd_data /mnt/ceph/osd0/data /mnt/ceph/osd0/data/journal
> ...
>
> The 3 osds started and ran normally, but the local ceph client was down.
> Does it have anything to do with the osd restart? The local client could
> remount ceph after a reboot and work normally. The remote client could
> remount ceph and work normally too, but a few days later it could not
> access or unmount ceph.
>
> # umount /mnt/ceph
> umount: /mnt/ceph: device is busy.
>         (In some cases useful info about processes that use
>          the device is found by lsof(8) or fuser(1))
>
> There was no response from the lsof or fuser commands; the only thing we
> could do was kill the process and reboot the system. We use ceph v0.21.2
> for both the cluster and the client, on Ubuntu 10.04 LTS (server),
> kernel 2.6.32-21-generic-pae.
>
> What confuses me is why the client can't access ceph. Even if the osds
> were down, that shouldn't affect the client. What is the reason the
> client can't access or unmount ceph?

It could be a number of things.  The output from

    cat /sys/kernel/debug/ceph/*/mdsc
    cat /sys/kernel/debug/ceph/*/osdc

will tell you if it's waiting for a server request to respond.  Also, if
you know the hung pid, you can

    cat /proc/$pid/stack

and see where it is blocked.  Also, dmesg | tail may have some relevant
console messages.
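A minimal sketch that strings those checks together (assuming the kernel
has debugfs and /proc/<pid>/stack support; the optional pid argument is
illustrative, not part of any ceph tool):

    #!/bin/sh
    # Make sure debugfs is mounted so the ceph client debug files are visible.
    grep -q debugfs /proc/mounts || mount -t debugfs none /sys/kernel/debug

    # Dump in-flight MDS and OSD requests; non-empty output means the
    # client is still waiting on a reply from a server.
    for f in /sys/kernel/debug/ceph/*/mdsc /sys/kernel/debug/ceph/*/osdc; do
        echo "== $f =="
        cat "$f"
    done

    # If a pid was given, show where that process is blocked in the kernel.
    if [ -n "$1" ]; then
        cat /proc/"$1"/stack
    fi

    # Recent kernel messages often mention ceph socket errors or timeouts.
    dmesg | tail -n 30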
> > When I followed the instructions at
> > http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a
> > monitor to ceph02, the following error occurred:
> >
> > > root@ceph02:~# /etc/init.d/ceph start mon1
> > > [/etc/ceph/fetch_config /tmp/fetched.ceph.conf.14210] ceph.conf 100% 2565 2.5KB/s 00:00
> > > === mon.1 ===
> > > Starting Ceph mon1 on ceph02...
> > > ** WARNING: Ceph is still under heavy development, and is only suitable for **
> > > ** testing and review. Do not trust it with important data. **
> > > terminate called after throwing an instance of 'std::logic_error'
> > >   what():  basic_string::_S_construct NULL not valid
> > > Aborted (core dumped)
> > > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
> >
> > I haven't seen that crash, but it looks like a std::string constructor
> > is being passed a NULL pointer.  Do you have a core dump (to get a
> > backtrace)?  Which version are you running (`cmon -v`)?
>
> The cmon version was v0.21.1 when the crash happened; it has since been
> updated to v0.21.2.
>
> The following backtrace is from v0.21.2:

Thanks, we'll see if we can reproduce and fix this one!

> [...]
> Thanks, I will wait for v0.22 and try to add the mds then, but I want to
> know if my config for the mds is right.
> >
> > I set 2 mds in ceph.conf:
> > [mds]
> > keyring = /etc/ceph/keyring.$name
> > debug ms = 1
> > [mds.ceph01]
> > host = ceph01
> > [mds.ceph02]
> > host = ceph02

Looks right.

> The result for 'ceph -s' was:
>
> 10.09.01_17:56:19.337895 mds e17: 1/1/1 up {0=up:active}, 1 up:standby
>
> But now the result for 'ceph -s' is:
>
> 10.09.19_17:01:50.398809 mds e27: 1/1/1 up {0=up:active}

It looks like the second 'standby' cmds went away.  Is the daemon still
running?

> > > Q4.
> > > How to set the journal path to a device or partition?
> >
> > 	osd journal = /dev/sdc1   ; or whatever
> >
> > How to know which journal is for a certain osd?
> > Can the following config do that?
> >
> > [osd]
> > sudo = true
> > osd data = /mnt/ceph/osd$id/data
> > [osd0]
> > host = ceph01
> > osd journal = /dev/sdc1
> >
> If I make a partition for the journal on a 500GB hdd, what is the proper
> size for the partition?

1 GB should be sufficient.

sage
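For illustration only (the journal partition names below are made up, not
from the thread above): giving each [osdN] section its own 'osd journal'
line is what ties a particular journal device to a particular osd, e.g.:

    [osd]
    sudo = true
    osd data = /mnt/ceph/osd$id/data
    [osd0]
    host = ceph01
    osd journal = /dev/sdc2   ; ~1 GB partition reserved for osd0's journal
    [osd2]
    host = ceph01
    osd journal = /dev/sdd2   ; ~1 GB partition reserved for osd2's journal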