Re: CEPH Expansion

Hi Craig!

Indeed, I had reduced the replicated size to 2 instead of 3, while the minimum size is 1.

I hadn't touched the crushmap though.

I would like to keep going with a replicated size of 2. Do you think this would be a problem?

Please find below the output of the command:

$ ceph osd dump | grep ^pool
pool 3 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 524 flags hashpspool stripe_width 0
pool 4 'metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 526 flags hashpspool stripe_width 0
pool 5 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 528 flags hashpspool stripe_width 0
pool 6 '.rgw' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 618 flags hashpspool stripe_width 0
pool 7 '.rgw.control' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 616 flags hashpspool stripe_width 0
pool 8 '.rgw.gc' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 614 flags hashpspool stripe_width 0
pool 9 '.log' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 612 flags hashpspool stripe_width 0
pool 10 '.intent-log' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 610 flags hashpspool stripe_width 0
pool 11 '.usage' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 608 flags hashpspool stripe_width 0
pool 12 '.users' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 606 flags hashpspool stripe_width 0
pool 13 '.users.email' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 604 flags hashpspool stripe_width 0
pool 14 '.users.swift' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 602 flags hashpspool stripe_width 0
pool 15 '.users.uid' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 600 flags hashpspool stripe_width 0
pool 16 '.rgw.root' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 598 flags hashpspool stripe_width 0
pool 17 '.rgw.buckets.index' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 596 flags hashpspool stripe_width 0
pool 18 '.rgw.buckets' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 826 flags hashpspool stripe_width 0
pool 19 '.rgw.buckets.extra' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 722 owner 18446744073709551615 flags hashpspool stripe_width 0



Warmest regards,


George

You've either modified the crushmap, or changed the pool size to 1.
The defaults create 3 replicas on different hosts.

What does `ceph osd dump | grep ^pool` output?  If the size param is
1, then you reduced the replica count.  If the size param is > 1, you
must've adjusted the crushmap.

Either way, after you add the second node would be the ideal time to
change that back to the default.
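
For reference, a rough sketch of what changing it back could look
like once the second node is in. This assumes you want every pool
back at the defaults of size 3 / min_size 2, and just uses rados
lspools to iterate over the pool names:

$ for pool in $(rados lspools); do
>     ceph osd pool set $pool size 3
>     ceph osd pool set $pool min_size 2
> done

Backfill for the extra replica starts as soon as size is raised, so
doing it pool by pool is an option if client load is a concern.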

Given that you only have 40GB of data in the cluster, you shouldn't
have a problem adding the second node.

On Fri, Jan 23, 2015 at 3:58 PM, Georgios Dimitrakakis  wrote:

Hi Craig!

For the moment I have only one node with 10 OSDs.
I want to add a second one with 10 more OSDs.

Each OSD in every node is a 4TB SATA drive. No SSD disks!

The data are approximately 40GB, and I will do my best to have zero,
or at least very low, load during the expansion process.

To be honest, I haven't touched the crushmap. I wasn't aware that I
should have changed it. Therefore, it is still the default one.
Is that OK? Where can I read about host-level replication in the
CRUSH map in order to make sure that it's applied, and how can I
find out if it is already enabled?
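
(A minimal way to check this, assuming the crushtool utility that
ships with Ceph is installed, is to dump and decompile the CRUSH map
and look at the chooseleaf step in the rule the pools use:)

$ ceph osd getcrushmap -o /tmp/crushmap.bin
$ crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
$ grep chooseleaf /tmp/crushmap.txt

If that step says "type host", replicas are placed on different hosts
(the default); if it says "type osd", two copies may end up on the
same host.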

Is there anything else that I should be aware of?

All the best,

George

It depends.  There are a lot of variables, like how many nodes and
disks you currently have, whether you are using journals on SSDs,
how much data is already in the cluster, and what the client load
on the cluster is.

Since you only have 40 GB in the cluster, it shouldn't take long to
backfill.  You may find that it finishes backfilling faster than you
can format the new disks.

Since you only have a single OSD node, you must've changed the
crushmap to allow replication over OSDs instead of hosts.  After you
get the new node in would be the best time to switch back to
host-level replication.  The more data you have, the more painful
that change will become.
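
To sketch what that switch involves (file names here are
placeholders, and the edit assumes the decompiled crushmap from
`ceph osd getcrushmap` / `crushtool -d`): change the chooseleaf step
in the rule your pools use back from osd to host, then recompile and
inject the map:

# in the decompiled crushmap, inside the rule used by your pools:
#   step chooseleaf firstn 0 type osd  ->  step chooseleaf firstn 0 type host

$ crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
$ ceph osd setcrushmap -i /tmp/crushmap.new

Expect the change itself to trigger data movement, which is why doing
it while there is still only ~40GB in the cluster keeps it cheap.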

On Sun, Jan 18, 2015 at 10:09 AM, Georgios Dimitrakakis  wrote:

Hi Jiri,

thanks for the feedback.

My main concern is whether it's better to add each OSD one by one
and wait for the cluster to rebalance every time, or to add them
all at once.

Furthermore, an estimate of the time to rebalance would be great!
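
(One rough, hedged sketch for keeping the impact low while the new
OSDs join, with values that are only assumptions to tune: throttle
backfill/recovery on the OSDs and watch progress until the cluster
returns to HEALTH_OK.)

$ ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
$ ceph -w    # watch recovery progress to get a feel for the remaining time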

Regards,


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




