I can't help much on the MDS front, but here are some answers and my
view on some of it.

On Wed, Mar 4, 2015 at 1:27 PM, Datatone Lists <lists@xxxxxxxxxxxxxx> wrote:
> I have been following ceph for a long time. I have yet to put it into
> service, and I keep coming back as btrfs improves and ceph reaches
> higher version numbers.
>
> I am now trying ceph 0.93 and kernel 4.0-rc1.
>
> Q1) Is it still considered that btrfs is not robust enough, and that
> xfs should be used instead? [I am trying with btrfs].

We are moving forward with btrfs on our production cluster, aware that
there may be performance issues. So far, the later kernels seem to have
resolved the issues we were seeing with snapshots. As the system grows
we will keep an eye on it, and we are prepared to move to XFS if needed.

> I followed the manual deployment instructions on the web site
> (http://ceph.com/docs/master/install/manual-deployment/) and I managed
> to get a monitor and several osds running and apparently working. The
> instructions fizzle out without explaining how to set up mds. I went
> back to mkcephfs and got things set up that way. The mds starts.
>
> [Please don't mention ceph-deploy]
>
> The first thing that I noticed is that (whether I set up mon and osds
> by following the manual deployment, or using mkcephfs), the correct
> default pools were not created.
>
> bash-4.3# ceph osd lspools
> 0 rbd,
> bash-4.3#
>
> I get only 'rbd' created automatically. I deleted this pool, and
> re-created data, metadata and rbd manually. When doing this, I had to
> juggle with the pg-num in order to avoid the 'too many pgs for osd'
> warning. I have three osds running at the moment, but intend to add to
> these when I have some experience of things working reliably. I am
> puzzled, because I seem to have to set the pg-num for the pool to a
> number that makes (N-pools x pg-num)/N-osds come to the right kind of
> number. So this implies that I can't really expand a set of pools by
> adding osds at a later date.
>
> Q2) Is there any obvious reason why my default pools are not getting
> created automatically as expected?

Since Giant, the data and metadata pools are no longer created
automatically; only the rbd pool is.

> Q3) Can pg-num be modified for a pool later? (If the number of osds is
> increased dramatically).

pg_num and pgp_num can be increased (but not decreased) on the fly
later, so you can expand a pool's placement groups as you add OSDs.
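For example, something like this should work (a rough sketch, untested
as written; the pool name and PG counts below are only placeholders,
pick values that suit your OSD count):

    ceph osd pool create data 128 128   # create a pool with 128 PGs
    ceph osd pool set data pg_num 256   # later, after adding OSDs
    ceph osd pool set data pgp_num 256  # then raise pgp_num to match

Raise pg_num first and pgp_num second, and do it in modest steps so the
cluster isn't rebalancing everything at once.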
> Finally, when I try to mount cephfs, I get a mount 5 error.
>
> "A mount 5 error typically occurs if a MDS server is laggy or if it
> crashed. Ensure at least one MDS is up and running, and the cluster is
> active + healthy".
>
> My mds is running, but its log is not terribly active:
>
> 2015-03-04 17:47:43.177349 7f42da2c47c0 0 ceph version 0.93
> (bebf8e9a830d998eeaab55f86bb256d4360dd3c4), process ceph-mds, pid 4110
> 2015-03-04 17:47:43.182716 7f42da2c47c0 -1 mds.-1.0 log_to_monitors
> {default=true}
>
> (This is all there is in the log).
>
> I think that a key indicator of the problem must be this from the
> monitor log:
>
> 2015-03-04 16:53:20.715132 7f3cd0014700 1
> mon.ceph-mon-00@0(leader).mds e1 warning, MDS mds.?
> [2001:8b0:xxxx:5fb3:xxxx:1fff:xxxx:9054]:6800/4036 up but filesystem
> disabled
>
> (I have added the 'xxxx' sections to obscure my ip address)
>
> Q4) Can you give me an idea of what is wrong that causes the mds to
> not play properly?
>
> I think that there are some typos on the manual deployment pages, for
> example:
>
> ceph-osd id={osd-num}
>
> This is not right. As far as I am aware it should be:
>
> ceph-osd -i {osd-num}

There are a few of these; running --help for the command usually gives
you the right syntax for the version you have installed, but it is
still very confusing.

> An observation. In principle, setting things up manually is not all
> that complicated, provided that clear and unambiguous instructions are
> provided. This simple piece of documentation is very important. My
> view is that the existing manual deployment instructions get a bit
> confused and confusing when they reach the osd setup, and the mds
> setup is completely absent.
>
> For someone who knows, this would be a fairly simple and fairly quick
> operation to review and revise this part of the documentation. I
> suspect that this part suffers from being really obvious stuff to the
> well initiated. For those of us closer to the start, this forms the
> ends of the threads that have to be picked up before the journey can
> be made.
>
> Very best regards,
> David

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com