Re: Crush rule freezes cluster

Timofey,

glad that you've managed to get it working :-)

Best,

George

FYI and for the record, here is the final rule:
# rules
rule replicated_ruleset {
  ruleset 0
  type replicated
  min_size 1
  max_size 10
  step take default
  step choose firstn 0 type room
  step choose firstn 0 type rack
  step choose firstn 0 type host
  step chooseleaf firstn 0 type osd
  step emit
}
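
(A rule like this can also be dry-run offline before it goes live. A
minimal sketch, assuming the recompiled map sits in a hypothetical file
crushmap.new and the rule keeps ruleset 0:

crushtool --test -i crushmap.new --rule 0 --num-rep 2 --show-mappings

This prints the OSDs that a range of sample inputs would map to, so you
can check that the replicas land on the rooms/racks/hosts you expect
without touching the live cluster.)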

And after resetting the node, I can't find any usable info. The cluster
works fine and the data was simply rebalanced across the OSD disks.
syslog:
May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Reloading.
May 9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Starting Network Time
Synchronization...
May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Started Network Time
Synchronization.
May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Reloading.
May  9 19:30:02 srv-lab-ceph-node-01 CRON[1731]: (CRON) info (No MTA
installed, discarding output)
May 11 11:54:57 srv-lab-ceph-node-01 rsyslogd: [origin
software="rsyslogd" swVersion="7.4.4" x-pid="689"
x-info="http://www.rsyslog.com"] start
May 11 11:54:56 srv-lab-ceph-node-01 rsyslogd: rsyslogd's groupid
changed to 103
May 11 11:54:57 srv-lab-ceph-node-01 rsyslogd: rsyslogd's userid
changed to 100

Sorry for the noise, guys. Georgios, in any case, thanks for the help.

2015-05-10 12:44 GMT+03:00 Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx>:
Timofey,

maybe your best chance is to connect directly to the server and see what
is going on, and then try to debug why the problem occurred. If you don't
want to wait until tomorrow, you can check the server through its direct
remote console access. Most servers provide one, just under a different
name (Dell calls it iDRAC, Fujitsu iRMC, etc.), so if you have it up and
running you can use that.

I think this should be your starting point and you can take it from there.

I am sorry I cannot help you further with the CRUSH rules and the reason
why it crashed, since I am far from being an expert in the field :-(

Regards,

George


Georgios, oh, sorry for my poor English _-_, maybe I expressed what I
want poorly =]

I know how to write a simple CRUSH rule and how to use it; I want several
things:
1. To understand why my test node went offline after I injected the bad
map. This is unexpected.
2. Maybe somebody can explain what happens with this map, and why.
3. Writing several crushmaps and/or switching them while the cluster is
running is not a problem. But in production we have several NFS servers
that I am thinking about moving to Ceph, and I can't take more than one
server down for maintenance at a time. I want to avoid a data disaster
while setting up and moving data to Ceph, so a case like "use local data
replication if only one node exists" looks usable as a temporary solution
until I add the second node _-_.
4. Maybe someone else has a test cluster and can check what happens to
clients if a crushmap like that one is injected.

2015-05-10 8:23 GMT+03:00 Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx>:

Hi Timofey,

assuming that you have more than one OSD host and that the replication
factor is equal to (or less than) the number of hosts, why don't you just
change the crushmap to host replication?

You just need to change the default CRUSHmap rule from

step chooseleaf firstn 0 type osd

to

step chooseleaf firstn 0 type host

I believe that this is the easiest way to have replication across OSD
nodes, unless you have a much more "sophisticated" setup.
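
In case it helps, that one-line change is normally applied with the usual
export / decompile / edit / recompile / inject cycle. A minimal sketch,
assuming hypothetical file names crushmap.bin, crushmap.txt and
crushmap.new:

ceph osd getcrushmap -o crushmap.bin        # export the compiled CRUSH map
crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
# edit crushmap.txt: change "step chooseleaf firstn 0 type osd"
#                    to     "step chooseleaf firstn 0 type host"
crushtool -c crushmap.txt -o crushmap.new   # recompile
ceph osd setcrushmap -i crushmap.new        # inject into the cluster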

Regards,

George



Hi list,
I have been experimenting with CRUSH maps, trying to get RAID1-like
behaviour (if the cluster has only one working OSD node, duplicate the
data across its local disks, to avoid data loss on a local disk failure
and to keep clients working, because this should not be a degraded state).
(
  In the best case, I want a dynamic rule, like:
  if there is only one host -> spread data over its local disks;
  else if host count > 1 -> spread over hosts (racks or something else);
)

I wrote a rule like the one below:

rule test {
  ruleset 0
  type replicated
  min_size 0
  max_size 10
  step take default
  step choose firstn 0 type host
  step chooseleaf firstn 0 type osd
  step emit
}

I injected it into the cluster and the client node; now it looks like it
got a kernel panic, and I've lost my connection with it. No ssh, no ping,
and since this is a remote node I can't see what happened until Monday.
Yes, it looks like I've shot myself in the foot.
This is just a test setup and destroying the cluster is not a problem, but
I think broken rules must not crash anything else and in the worst case
should simply be rejected by the cluster / crushtool compiler.
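
(For what it's worth, my understanding is that the crushtool compiler only
checks the syntax of a rule; it does not verify that the rule produces
usable mappings. A minimal sketch of an offline dry run that can flag such
a rule before injection, assuming the edited text map is in a hypothetical
file crushmap.txt:

crushtool -c crushmap.txt -o crushmap.test
crushtool --test -i crushmap.test --rule 0 --num-rep 2 --show-bad-mappings

--show-bad-mappings reports every sample input that maps to fewer OSDs
than requested, which is a strong hint not to inject the map into a live
cluster.)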

Maybe someone can explain how this rule could crash the system? Maybe
there is a crazy mistake somewhere?




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



