Re: Need help from Ceph experts

I've done single nodes.  I have a couple of VMs for RadosGW federation testing; the setup uses a single virtual network, with both "clusters" on the same network.

Because I'm only using a single OSD on a single host, I had to update the CRUSH map to handle that.  My Chef recipe runs:
ceph osd getcrushmap -o /tmp/compiled-crushmap.old

crushtool -d /tmp/compiled-crushmap.old -o /tmp/decompiled-crushmap.old

sed -e '/step chooseleaf firstn 0 type/s/host/osd/' /tmp/decompiled-crushmap.old > /tmp/decompiled-crushmap.new

crushtool -c /tmp/decompiled-crushmap.new -o /tmp/compiled-crushmap.new

ceph osd setcrushmap -i /tmp/compiled-crushmap.new
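
For reference, the line that the sed rewrites looks roughly like this in the decompiled map (exact rule names may differ on your cluster):

# default rule: place replicas on separate hosts
step chooseleaf firstn 0 type host
# after the sed: allow replicas on separate OSDs of the same host
step chooseleaf firstn 0 type osd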


Those are the only extra commands I run for a single node cluster.  Otherwise, it looks the same as my production nodes that run mon, osd, and rgw.


Here's my single node's ceph.conf:
[global]
  fsid = a7798848-1d31-421b-8f3c-5a34d60f6579
  mon initial members = test0-ceph0
  mon host = 172.16.205.143:6789
  auth client required = none
  auth cluster required = none
  auth service required = none
  mon warn on legacy crush tunables = false
  osd crush chooseleaf type = 0
  osd pool default flag hashpspool = true
  osd pool default min size = 1
  osd pool default size = 1
  public network = 172.16.205.0/24

[osd]
  osd journal size = 1000
  osd mkfs options xfs = -s size=4096
  osd mkfs type = xfs
  osd mount options xfs = rw,noatime,nodiratime,nosuid,noexec,inode64
  osd_scrub_sleep = 1.0
  osd_snap_trim_sleep = 1.0



[client.radosgw.test0-ceph0]
  host = test0-ceph0
  rgw socket path = /var/run/ceph/radosgw.test0-ceph0
  keyring = /etc/ceph/ceph.client.radosgw.test0-ceph0.keyring
  log file = /var/log/ceph/radosgw.log
  admin socket = /var/run/ceph/radosgw.asok
  rgw dns name = test0-ceph
  rgw region = us
  rgw region root pool = .us.rgw.root
  rgw zone = us-west
  rgw zone root pool = .us-west.rgw.root
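
One caveat (just a sketch, not part of my recipe): the "osd pool default size" settings only apply to pools created after the option is in place.  For pools that already exist you'd shrink them by hand, something like this ("rbd" is just an example pool name):

ceph osd pool set rbd size 1
ceph osd pool set rbd min_size 1
ceph -s    # should settle back to HEALTH_OK once the PGs are active+clean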



On Thu, Dec 18, 2014 at 11:23 PM, Debashish Das <deba.daz@xxxxxxxxx> wrote:
Hi Team,

Thanks for the insight and the replies. As I understood from the mails, running a Ceph cluster on a single node is possible but definitely not recommended.

The challenge I see is that there is no clear documentation for a single-node installation.

So if anyone has installed Ceph on a single node, please share a link or document I can refer to for installing Ceph on my local server.

Again thanks guys !!

Kind Regards
Debashish Das

On Fri, Dec 19, 2014 at 6:08 AM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
Thanks, I'll look into these.

On Thu, Dec 18, 2014 at 5:12 PM, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:
I think this is it:  https://engage.redhat.com/inktank-ceph-reference-architecture-s-201409080939

You can also check out a presentation on Cern's Ceph cluster: http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern


At large scale, the biggest problem will likely be network I/O on the inter-switch links.



On Thu, Dec 18, 2014 at 3:29 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
I'm interested to know if there is a reference for this reference architecture. It would help alleviate some of the fears we have about scaling this thing to a massive scale (tens of thousands of OSDs).

Thanks,
Robert LeBlanc

On Thu, Dec 18, 2014 at 3:43 PM, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:


On Thu, Dec 18, 2014 at 5:16 AM, Patrick McGarry <patrick@xxxxxxxxxxx> wrote:

> 2. What should be the minimum hardware requirement of the server (CPU,
> Memory, NIC etc)

There is no real "minimum" to run Ceph; it's all about what your
workload will look like and what kind of performance you need. We have
seen Ceph run on Raspberry Pis.
 
Technically, the smallest cluster is a single node with a 10 GiB disk.  Anything smaller won't work.
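
If it helps, here's a rough sketch of the smallest possible bring-up using ceph-deploy (the hostname "node1" and device "sdb" are placeholders; adjust for your box):

ceph-deploy new node1
# append single-node friendly defaults to the generated ceph.conf:
#   osd pool default size = 1
#   osd crush chooseleaf type = 0
ceph-deploy install node1
ceph-deploy mon create-initial
ceph-deploy osd create node1:sdb
ceph-deploy admin node1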

That said, Ceph was envisioned to run on large clusters.  IIRC, the reference architecture has 7 rows, each row having 10 racks, all full.

Those of us running small clusters (fewer than 10 nodes) are noticing that it doesn't work quite as well.  We have to significantly scale back the amount of backfilling and recovery that is allowed.  I try to keep all backfill/recovery operations touching less than 20% of my OSDs.  The reference architecture could lose a whole row and still stay under that limit.  My 5-node cluster is noticeably better than the 3-node cluster: it's faster, has lower latency, and latency doesn't increase as much during recovery operations.
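
The knobs we scale back are the standard backfill/recovery throttles; something like this in ceph.conf (the values are just where I'd start, not gospel):

[osd]
  osd max backfills = 1
  osd recovery max active = 1
  osd recovery op priority = 1

or injected at runtime:

ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'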


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
