Sorry, I just realized this one slipped through the cracks!

On Sat, 4 Sep 2010, FWDF wrote:
> We use 3 servers to build a test system of ceph, configured as below:
>
> Host       IP
> client01   192.168.1.10
> ceph01     192.168.2.50
> ceph02     192.168.2.51
>
> The OS is Ubuntu 10.04 LTS and the version of ceph is v0.21.1.
>
> ceph.conf:
> [global]
>         auth supported = cephx
>         pid file = /var/run/ceph/$name.pid
>         debug ms = 0
>         keyring = /etc/ceph/keyring.bin
> [mon]
>         mon data = /mnt/ceph/data/mon$id
>         debug ms = 1
> [mon0]
>         host = ceph01
>         mon addr = 192.168.2.50:6789
> [mds]
>         keyring = /etc/ceph/keyring.$name
>         debug ms = 1
> [mds.ceph01]
>         host = ceph01
> [mds.ceph02]
>         host = ceph02
> [osd]
>         sudo = true
>         osd data = /mnt/ceph/osd$id/data
>         keyring = /etc/ceph/keyring.$name
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100
> [osd0]
>         host = ceph01
> [osd1]
>         host = ceph01
> [osd2]
>         host = ceph01
> [osd3]
>         host = ceph01
> [osd10]
>         host = ceph02
>
> There are 4 HDDs in ceph01, and every HDD has an OSD, named osd0, osd1,
> osd2, osd3; there is 1 HDD in ceph02, named osd10.
> All these HDDs are formatted as btrfs and mounted at the mount points
> listed below:
>
> ceph01
>         /dev/sdc1   /mnt/ceph/osd0/data    btrfs
>         /dev/sdd1   /mnt/ceph/osd1/data    btrfs
>         /dev/sde1   /mnt/ceph/osd2/data    btrfs
>         /dev/sdf1   /mnt/ceph/osd3/data    btrfs
>
> ceph02
>         /dev/sdb1   /mnt/ceph/osd10/data   btrfs
>
> Make the ceph filesystem:
> root@ceph01:~# mkcephfs -c /etc/ceph/ceph.conf -a -k /etc/ceph/keyring.bin
>
> Start up ceph:
> root@ceph01:~# /etc/init.d/ceph -a start
>
> Then
> root@ceph01:~# ceph -w
> 10.09.01_17:56:19.337895    mds e17: 1/1/1 up {0=up:active}, 1 up:standby
> 10.09.01_17:56:19.347184    osd e27: 5 osds: 5 up, 5 in
> 10.09.01_17:56:19.349447    log
> 10.09.01_17:56:19.373773    mon e1: 1 mons at 192.168.2.50:6789/0
>
> The ceph file system is mounted on client01 (192.168.1.10), ceph01
> (192.168.2.50), and ceph02 (192.168.2.51) at /data/ceph. It works fine at
> the beginning: I can use ls, and writing and reading files is OK. After
> some files are written, I find I can't use ls -l /data/ceph until I umount
> ceph from ceph02. But one day later the same problem occurred again; then
> I umounted ceph from ceph01 and everything was OK.
>
> Q1:
> Can the ceph filesystem be mounted on a member of the ceph cluster?

Technically, yes, but you should be very careful doing so.  The problem is
that when the kernel is low on memory it will force the client to write
out dirty data so that it can reclaim those pages.  If the writeout
depends on then waking up some user process (cosd daemon), doing a bunch
of random work, and writing the data to disk (dirtying yet more memory),
you can deadlock the system.

> When I followed the instructions at
> http://ceph.newdream.net/wiki/Monitor_cluster_expansion to add a monitor
> on ceph02, the following error occurred:
>
> root@ceph02:~# /etc/init.d/ceph start mon1
> [/etc/ceph/fetch_config/tmp/fetched.ceph.conf.14210] ceph.conf 100% 2565 2.5KB/s 00:00
> === mon.1 ===
> Starting Ceph mon1 on ceph02...
>  ** WARNING: Ceph is still under heavy development, and is only suitable for **
>  ** testing and review. Do not trust it with important data. **
> terminate called after throwing an instance of 'std::logic_error'
>   what(): basic_string::_S_construct NULL not valid
> Aborted (core dumped)
> failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '

I haven't seen that crash, but it looks like a std::string constructor is
being passed a NULL pointer.  Do you have a core dump (to get a
backtrace)?  Which version are you running (`cmon -v`)?

> Q2:
> How to add a monitor to a running ceph system?

The process in that wiki article can expand the monitor cluster while it
is online.  Note that the monitor identification changed slightly between
v0.21 and the current unstable branch (will be v0.22), and the
instructions still need to be updated for that.

> Q3:
> Is it possible to add an mds while the ceph system is running? How?

Yes.  Add the new mds to ceph.conf, start the daemon.  You should see it
as up:standby in the 'ceph -s' or 'ceph mds dump -o -' output.  Then

	ceph mds setmaxmds 2

will change the size of the 'active' cluster to 2.

Please keep in mind the clustered MDS still has some bugs; we expect v0.22
to be stable.

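For illustration, the ceph.conf addition might look like the sketch below.
It assumes a hypothetical third host named ceph03, which is not part of
the setup described above.  Since cephx is enabled in that config, the new
daemon would also need its key in place at the [mds] keyring path; that
step is not shown here.

```
; hypothetical new mds; "ceph03" is an assumed host name for illustration
[mds.ceph03]
        host = ceph03
```

Once the daemon is started on that host, it should appear as up:standby
in the output of the commands mentioned above.
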
> I fdisked a HDD into two partitions, one for the journal and the other
> for data, like this:
> /dev/sdc1 (180GB) as data
> /dev/sdc2 (10GB) as journal
>
> /dev/sdc1 made as btrfs, mounted at /mnt/osd0/data
> /dev/sdc2 made as btrfs, mounted at /mnt/osd0/journal
>
> ceph.conf:
>
> [osd]
>         osd data = /mnt/ceph/osd$id/data
>         osd journal = /mnt/ceph/osd$id/journal
>         ; osd journal size = 100
>
> When I use the mkcephfs command, I can't build the osd until I edit
> ceph.conf like this:
>
> [osd]
>         osd data = /mnt/ceph/osd$id/data
>         osd journal = /mnt/ceph/osd$id/data/journal
>         osd journal size = 100

If the journal is a file, the system won't create it for you unless you
specify a size.  If it already exists (e.g., you created it via 'dd', or
it's a block device) the journal size isn't needed.

> Q4.
> How to set the journal path to a device or partition?

	osd journal = /dev/sdc1  ; or whatever

Hope this helps!  Sorry for the slow response.  Let us know if you have
further questions!

sage

> Thanks for all help and reply, sorry for my lame English.
>
> Lin
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html