Re: mds server(s) crashed

"Yan, Zheng" <ukernel@xxxxxxxxx> · Thu, 13 Aug 2015 10:21:52 +0800

On Thu, Aug 13, 2015 at 7:05 AM, Bob Ababurko <bob@xxxxxxxxxxxx> wrote:
>
> If I am using a more recent client (kernel or ceph-fuse), should I still be
> worried about the MDSs crashing?  I have added RAM to my MDS hosts, and it's
> my understanding that this will also help mitigate any issues, in addition to
> setting mds_bal_frag = true.  Not having used cephfs before, do I always
> need to worry about my MDS servers crashing, thus the need for
> setting mds_reconnect_timeout to 0?  This is not ideal for us, nor is the
> idea of clients not being able to access their mounts after an MDS recovery.
>

It's unlikely that this issue will happen again, but I can't guarantee
that no other issues will come up.

There is no need to set mds_reconnect_timeout to 0.

> I am actually looking for the most stable way to implement cephfs at this
> point.  My cephfs cluster contains millions of small files, and therefore
> many inodes, if that needs to be taken into account.  Perhaps I should only
> be using one MDS node for stability at this point?  Is this the best way
> forward to get a handle on stability?  I'm also curious whether I should set
> my mds cache size to a number greater than the number of files I have in the
> cephfs cluster.  If you can give some key points on configuring cephfs for
> the best stability and, if possible, availability, this would be helpful to me.

One active MDS is the most stable setup. Adding a few standby MDS
daemons should not hurt stability.

You can't set mds cache size to a number greater than the number of
files in the fs; that would require lots of memory.
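
To make the memory concern concrete, here is a rough back-of-the-envelope
sketch in Python. It relies on mds cache size being a count of cached inodes.
The per-inode memory cost used below is a placeholder assumption, not a
measured Ceph figure, and the function names are made up for illustration.
Measure the real per-inode cost on your own MDS before relying on any of
these numbers.

```python
GIB = 1024 ** 3

# ASSUMPTION: placeholder cost of one cached inode in MDS memory.
# This is NOT an official Ceph number; measure it on your own cluster.
ASSUMED_BYTES_PER_INODE = 4 * 1024


def estimate_mds_cache_bytes(num_inodes, bytes_per_inode):
    """Rough memory needed to keep num_inodes inodes in the MDS cache."""
    return num_inodes * bytes_per_inode


def fits_in_ram(num_inodes, bytes_per_inode, ram_bytes, headroom=0.5):
    """True if the estimate stays within a fraction (headroom) of host RAM."""
    needed = estimate_mds_cache_bytes(num_inodes, bytes_per_inode)
    return needed <= ram_bytes * headroom


# Compare a few candidate cache sizes against a hypothetical 16 GiB MDS host.
for inodes in (100_000, 1_000_000, 10_000_000):
    needed_gib = estimate_mds_cache_bytes(inodes, ASSUMED_BYTES_PER_INODE) / GIB
    ok = fits_in_ram(inodes, ASSUMED_BYTES_PER_INODE, 16 * GIB)
    print(f"{inodes:>10} inodes: ~{needed_gib:.2f} GiB, fits: {ok}")
```

Under these assumed numbers, caching every inode of a fs with tens of
millions of files would not fit on such a host, which is why sizing the
cache to the whole file count is usually not practical.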

Yan, Zheng

>
> thanks again for the help.
>
> thanks,
> Bob
>