2010/9/23 Sage Weil <sage@xxxxxxxxxxxx>:
> On Wed, 22 Sep 2010, cang lin wrote:
>> What confuses me is why the client can't access ceph. Even if the osd was
>> down, that shouldn't affect the client. What is the reason the client
>> can't access or unmount ceph?
>
> It could be a number of things.  The output from
>
>   cat /sys/kernel/debug/ceph/*/mdsc
>   cat /sys/kernel/debug/ceph/*/osdc
>
> will tell you if it's waiting for a server request to respond.  Also, if
> you know the hung pid, you can
>
>   cat /proc/$pid/stack
>
> and see where it is blocked.  Also,
>
>   dmesg | tail
>
> may have some relevant console messages.
>
>
>> > When I follow the instruction of
>> > http://ceph.newdream.net/wiki/Monitor_cluster_expansion to expand a
>> > monitor to ceph02, the following error occurred:
>> >
>> > > root@ceph02:~# /etc/init.d/ceph start mon1
>> > > [/etc/ceph/fetch_config /tmp/fetched.ceph.conf.14210]  ceph.conf  100% 2565  2.5KB/s  00:00
>> > > === mon.1 ===
>> > > Starting Ceph mon1 on ceph02...
>> > > ** WARNING: Ceph is still under heavy development, and is only suitable for **
>> > > ** testing and review.  Do not trust it with important data. **
>> > > terminate called after throwing an instance of 'std::logic_error'
>> > >   what():  basic_string::_S_construct NULL not valid
>> > > Aborted (core dumped)
>> > > failed: ' /usr/bin/cmon -i 1 -c /tmp/fetched.ceph.conf.14210 '
>> >
>> > I haven't seen that crash, but it looks like a std::string constructor is
>> > being passed a NULL pointer.  Do you have a core dump (to get a
>> > backtrace)?  Which version are you running (`cmon -v`)?
>> >
>>
>> The cmon version was v0.21.1 when the crash happened, and it has since
>> been updated to v0.21.2.
>>
>> The following backtrace is from v0.21.2:
>
> Thanks, we'll see if we can reproduce and fix this one!
>
>> [...]
>> Thanks, I will wait for v0.22 and try to add the mds then, but I want to
>> know whether my config for the mds is right.
>>
>> I set 2 mds in ceph.conf:
>>
>> [mds]
>>         keyring = /etc/ceph/keyring.$name
>>         debug ms = 1
>> [mds.ceph01]
>>         host = ceph01
>> [mds.ceph02]
>>         host = ceph02
>
> Looks right.
>
>
>> The result for 'ceph -s' was:
>>
>> 10.09.01_17:56:19.337895   mds e17: 1/1/1 up {0=up:active}, 1 up:standby
>>
>> But now the result for 'ceph -s' is:
>>
>> 10.09.19_17:01:50.398809   mds e27: 1/1/1 up {0=up:active}
>
> It looks like the second 'standby' cmds went away.  Is the daemon still
> running?
>

I don't know if it was still running, because both mds are down now.

>>
>> If I make a partition for the journal on a 500GB hdd, what is the proper
>> size for the partition?
>
> 1 GB should be sufficient.
>
> sage

Thanks!

Lin
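
For reference, a minimal sketch of how such a journal partition could be
declared in ceph.conf (the device path /dev/sdb2 and the osd id below are
hypothetical, not taken from this thread; with a raw partition the journal
simply fills the partition, while a file-based journal would instead set
'osd journal size' in MB):

[osd]
        ; hypothetical: each osd uses a dedicated ~1 GB journal partition
        osd journal = /dev/sdb2
        ; file-based alternative of roughly the same size:
        ; osd journal = /data/osd.$id/journal
        ; osd journal size = 1000
[osd.0]
        host = ceph01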