Re: some questions about ceph deployment

Sorry, I just realized this one slipped through the cracks!

On Sat, 4 Sep 2010, FWDF wrote:

> We use 3 servers to build a test system of ceph, configured as below:
> 
> Host                192.168.x.x (IP)
> client01            192.168.1.10
> ceph01              192.168.2.50
> ceph02              192.168.2.51
> 
> The OS is Ubuntu 10.04 LTS and the version of ceph is v0.21.1.
> 
> ceph.conf:
> [global]
>         auth supported = cephx
>         pid file = /var/run/ceph/$name.pid
>         debug ms = 0
>         keyring = /etc/ceph/keyring.bin
> [mon]
>         mon data = /mnt/ceph/data/mon$id
>         debug ms = 1
> [mon0]
>         host = ceph01
>         mon addr = 192.168.2.50:6789
> [mds]
>         keyring = /etc/ceph/keyring.$name
>         debug ms = 1
> [mds.ceph01]
>         host = ceph01
> [mds.ceph02]
>         host = ceph02
> [osd]
>         sudo = true
>         osd data = /mnt/ceph/osd$id/data
>         keyring = /etc/ceph/keyring.$name
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100
> [osd0]
>         host = ceph01
> [osd1]
>         host = ceph01
> [osd2]
>         host = ceph01
> [osd3]
>         host = ceph01
> [osd10]
>         host = ceph02
> 
> There are 4 HDDs in ceph01, and each HDD holds one OSD, named osd0, osd1, osd2 and osd3; there is 1 HDD in ceph02, named osd10. All these HDDs are formatted as btrfs and mounted on the mount points listed below:
> 
> ceph01
>          /dev/sdc1         /mnt/ceph/osd0/data               btrfs
>          /dev/sdd1         /mnt/ceph/osd1/data               btrfs
>          /dev/sde1         /mnt/ceph/osd2/data               btrfs
>          /dev/sdf1         /mnt/ceph/osd3/data               btrfs
> 
> ceph02
>          /dev/sdb1         /mnt/ceph/osd10/data              btrfs
> 
> Make the ceph filesystem:
> root@ceph01:~#  mkcephfs  -c /etc/ceph/ceph.conf -a -k /etc/ceph/keyring.bin
> 
> Start ceph:
> root@ceph01:~#  /etc/init.d/ceph -a start
> 
>          Then:
> root@ceph01:~#  ceph -w
> 10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
> 10.09.01_17:56:19.347184   osd e27: 5 osds: 5 up, 5 in
> 10.09.01_17:56:19.349447   log
> 10.09.01_17:56:19.373773   mon e1: 1 mons at 192.168.2.50:6789/0
> 
> The ceph filesystem is mounted on client01 (192.168.1.10), ceph01 (192.168.2.50) and ceph02 (192.168.2.51) at /data/ceph. It works fine at the beginning: I can use ls, and reading and writing files works. After some files are written, I find I can't run 'ls -l /data/ceph' until I umount ceph from ceph02; but one day later the same problem occurred again, so I umounted ceph from ceph01 and everything is ok.
> 
> Q1:
>          Can the ceph filesystem be mounted on a member of the ceph cluster?

Technically, yes, but you should be very careful doing so.  The problem is 
that when the kernel is low on memory it will force the client to write 
out dirty data so that it can reclaim those pages.  If the writeout 
depends in turn on waking up a user process (the cosd daemon), doing a bunch 
of work, and writing the data back out to disk (dirtying yet more memory), 
you can deadlock the system.
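
For testing it is much safer to mount only from a box that runs no ceph 
daemons (client01 in your setup).  A rough sketch, assuming the kernel 
client: the option names (name=, secret=) may vary a bit by kernel version, 
and the secret is the admin key from your keyring.bin:

 # on client01, which runs no cosd/cmds/cmon
 mkdir -p /data/ceph
 mount -t ceph 192.168.2.50:6789:/ /data/ceph -o name=admin,secret=<admin key>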
 
>          When I follow the instructions at http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a monitor to ceph02, the following error occurred:
> 
> root@ceph02:~#  /etc/init.d/ceph start mon1
> [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100%  2565  2.5KB/s  00:00 
> === mon.1 ===
> Starting Ceph mon1 on ceph02...
>  ** WARNING: Ceph is still under heavy development, and is only suitable for **
>  ** testing and review.  Do not trust it with important data.  **
> terminate called after throwing an instance of 'std::logic_error'
>   what():  basic_string::_S_construct NULL not valid
> Aborted (core dumped)
> failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '

I haven't seen that crash, but it looks like a std::string constructor is 
being passed a NULL pointer.  Do you have a core dump (to get a 
backtrace)?  Which version are you running (`cmon -v`)?
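
If you do have the core file, a gdb backtrace would be ideal (this assumes 
gdb and the cmon debug symbols are installed; the core path below is just 
an example):

 gdb /usr/bin/cmon /path/to/core
 (gdb) bt                     # backtrace of the crashed thread
 (gdb) thread apply all bt    # all threads, if the first one isn't obvious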

> Q2:
> How do I add a monitor to a running ceph system?

The process in that wiki article can expand the monitor cluster while it 
is online.  Note that the monitor identification changed slightly between 
v0.21 and the current unstable branch (will be v0.22), and the 
instructions still need to be updated for that.
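
The ceph.conf side of it is just another mon section; with your addresses 
it would look something like the below.  This alone isn't enough, though: 
the new monitor's 'mon data' directory has to be seeded from the cluster's 
current monmap first, which is what the wiki steps do.

 [mon1]
         host = ceph02
         mon addr = 192.168.2.51:6789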

> Q3:
>     Is it possible to add an mds while the ceph system is running? How?

Yes.  Add the new mds to ceph.conf and start the daemon.  You should see it 
as up:standby in the 'ceph -s' or 'ceph mds dump -o -' output.  Then

 ceph mds setmaxmds 2

will change the size of the 'active' cluster to 2.

Please keep in mind the clustered MDS still has some bugs; we expect v0.22 
to be stable.
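
Concretely, for a hypothetical third host ceph03 it would look something 
like this.  (With cephx enabled the new daemon will also need a key at 
whatever path your [mds] 'keyring' setting expands to.)

 ; add to ceph.conf on all nodes
 [mds.ceph03]
         host = ceph03

 # on ceph03, using the same init script form you used for mon1
 /etc/init.d/ceph start mds.ceph03

 # once it shows up as up:standby
 ceph mds setmaxmds 2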

> 
> I fdisked a HDD into two partitions, one for the journal and the other for data, like this:
> /dev/sdc1 (180GB) as data
> /dev/sdc2 (10GB) as journal
> 
> /dev/sdc1 formatted as btrfs, mounted at /mnt/osd0/data
> /dev/sdc2 formatted as btrfs, mounted at /mnt/osd0/journal
> 
> ceph.conf:
> ...
> [osd]
>         osd data = /mnt/ceph/osd$id/data
>         osd journal = /mnt/ceph/osd$id/journal
>         ; osd journal size = 100
> ...
> When I use the mkcephfs command, I can't build the osd until I edit ceph.conf like this:
> 
> [osd]
>         osd data = /mnt/ceph/osd$id/data
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100

If the journal is a file, the system won't create it for you unless you 
specify a size.  If it already exists (e.g., you created it via 'dd', or 
it's a block device) the journal size isn't needed.
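
For example, to pre-create the journal file by hand (the path and size here 
are only an example; point 'osd journal' at the file itself, not at the 
directory it lives in):

 dd if=/dev/zero of=/mnt/ceph/osd0/journal/journal bs=1M count=1024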

> Q4.
>   How do I set the journal path to a device or partition?

	osd journal = /dev/sdc1  ; or whatever
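
With the partitioning you described, a per-osd override would look 
something like this (once cosd writes to sdc2 directly as the journal, it 
shouldn't also be mounted as a filesystem):

 [osd0]
         host = ceph01
         osd journal = /dev/sdc2   ; raw partition, no 'osd journal size' needed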

Hope this helps!  Sorry for the slow response.  Let us know if you have 
further questions!

sage


> Thanks for all the help and replies; sorry for my lame English.
> 
> Lin
> 
> 
