Re: Crush rule freeze cluster

Hey! I caught it again. It's a kernel bug: the kernel crashed when I tried to
map an rbd device with a map like the one above!
Hooray!
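
(For anyone who hits the same thing, a rough sketch of how to reproduce and
inspect it, assuming a throwaway pool/image named rbd/test -- the names are
only placeholders:

  rbd create rbd/test --size 128   # create a small test image
  sudo rbd map rbd/test            # the kernel client maps it; this is the step that crashed here
  dmesg | grep -i crush            # after a reboot, look for the oops / CRUSH messages in the kernel log
)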

2015-05-11 12:11 GMT+03:00 Timofey Titovets <nefelim4ag@xxxxxxxxx>:
> FYI and history
> Rule:
> # rules
> rule replicated_ruleset {
>   ruleset 0
>   type replicated
>   min_size 1
>   max_size 10
>   step take default
>   step choose firstn 0 type room
>   step choose firstn 0 type rack
>   step choose firstn 0 type host
>   step chooseleaf firstn 0 type osd
>   step emit
> }
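>
> (A rough way to sanity-check a rule like this offline before injecting it,
> assuming the edited text map is in map.txt -- the file name is only an
> example; note this only checks placement, so it probably would not have
> caught the kernel-client crash:
>
>   crushtool -c map.txt -o map.bin                   # compile the edited map
>   crushtool -i map.bin --test --rule 0 --num-rep 3 \
>             --show-mappings                         # show where replicas would land
> )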
>
> And after resetting the node, I can't find any usable info. The cluster works
> fine and the data was simply rebalanced across the OSD disks.
> syslog:
> May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Reloading.
> May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Starting Network Time
> Synchronization...
> May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Started Network Time
> Synchronization.
> May  9 19:30:02 srv-lab-ceph-node-01 systemd[1]: Reloading.
> May  9 19:30:02 srv-lab-ceph-node-01 CRON[1731]: (CRON) info (No MTA
> installed, discarding output)
> May 11 11:54:57 srv-lab-ceph-node-01 rsyslogd: [origin
> software="rsyslogd" swVersion="7.4.4" x-pid="689"
> x-info="http://www.rsyslog.com"] start
> May 11 11:54:56 srv-lab-ceph-node-01 rsyslogd: rsyslogd's groupid changed to 103
> May 11 11:54:57 srv-lab-ceph-node-01 rsyslogd: rsyslogd's userid changed to 100
>
> Sorry for the noise, guys. Georgios, thanks for the help anyway.
>
> 2015-05-10 12:44 GMT+03:00 Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx>:
>> Timofey,
>>
>> maybe your best option is to connect directly to the server and see what is
>> going on.
>> Then you can try to debug why the problem occurred. If you don't want to wait
>> until tomorrow,
>> you can try to see what is going on using the server's direct remote console
>> access.
>> Most servers provide this, just under a different name
>> (Dell calls it iDRAC, Fujitsu iRMC, etc.), so if you have it up and
>> running you can use that.
>>
>> I think this should be your starting point, and you can take it from
>> there.
>>
>> I am sorry I cannot help you further with the CRUSH rules and the reason why
>> it crashed, since I am far from being an expert in the field :-(
>>
>> Regards,
>>
>> George
>>
>>
>>> Georgios, oh, sorry for my poor English _-_, maybe I expressed poorly
>>> what I want =]
>>>
>>> I know how to write a simple CRUSH rule and how to use it; I want several
>>> things:
>>> 1. To understand why, after injecting a bad map, my test node went offline.
>>> This was unexpected.
>>> 2. Maybe somebody can explain what happens with this map, and why.
>>> 3. It is not a problem to write several crushmaps and/or switch between
>>> them while the cluster is running.
>>> But in production we have several NFS servers that I am thinking about
>>> moving to Ceph, and I can't take more than one server down for maintenance
>>> at a time. I want to avoid a data disaster while setting up and moving data
>>> to Ceph, so a case like "use local data replication if only one node
>>> exists" looks usable as a temporary solution until I add a second
>>> node _-_.
>>> 4. Maybe someone else who has a test cluster can check what happens
>>> to clients when a crushmap like this is injected.
>>>
>>> 2015-05-10 8:23 GMT+03:00 Georgios Dimitrakakis <giorgis@xxxxxxxxxxxx>:
>>>>
>>>> Hi Timofey,
>>>>
>>>> assuming that you have more than one OSD host and that the replication
>>>> factor is equal to (or less than) the number of hosts, why don't you just
>>>> change the crushmap to host replication?
>>>>
>>>> You just need to change the default CRUSHmap rule from
>>>>
>>>> step chooseleaf firstn 0 type osd
>>>>
>>>> to
>>>>
>>>> step chooseleaf firstn 0 type host
>>>>
>>>> I believe this is the easiest way to get replication across OSD
>>>> nodes unless you have a much more "sophisticated" setup.
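>>>>
>>>> (Roughly, the edit cycle would be -- the file names are only placeholders:
>>>>
>>>>   ceph osd getcrushmap -o crush.bin     # export the current compiled map
>>>>   crushtool -d crush.bin -o crush.txt   # decompile it to text
>>>>   # edit crush.txt: change "type osd" to "type host" in the rule
>>>>   crushtool -c crush.txt -o crush.new   # recompile
>>>>   ceph osd setcrushmap -i crush.new     # inject the new map
>>>> )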
>>>>
>>>> Regards,
>>>>
>>>> George
>>>>
>>>>
>>>>
>>>>> Hi list,
>>>>> I've been experimenting with CRUSH maps, trying to get RAID1-like
>>>>> behaviour (if the cluster has only one working OSD node, duplicate the
>>>>> data across its local disks, to avoid data loss if a local disk fails and
>>>>> to let clients keep working, since this would not be a degraded state)
>>>>> (
>>>>>   in the best case, I want a dynamic rule, like:
>>>>>   if there is only one host -> spread data over the local disks;
>>>>>   else if host count > 1 -> spread over hosts (racks or something else);
>>>>> )
>>>>>
>>>>> I wrote a rule like the one below:
>>>>>
>>>>> rule test {
>>>>>               ruleset 0
>>>>>               type replicated
>>>>>               min_size 0
>>>>>               max_size 10
>>>>>               step take default
>>>>>               step choose firstn 0 type host
>>>>>               step chooseleaf firstn 0 type osd
>>>>>               step emit
>>>>> }
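>>>>>
>>>>> (For comparison, the stock host-level replication rule needs no
>>>>> intermediate "choose ... type host" step -- a single chooseleaf already
>>>>> separates replicas across hosts:
>>>>>
>>>>> rule replicated_host {
>>>>>               ruleset 1
>>>>>               type replicated
>>>>>               min_size 1
>>>>>               max_size 10
>>>>>               step take default
>>>>>               step chooseleaf firstn 0 type host
>>>>>               step emit
>>>>> }
>>>>> )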
>>>>>
>>>>> I injected it into the cluster, and the client node now appears to have
>>>>> hit a kernel panic; I've lost my connection to it. No ssh, no ping. It is
>>>>> a remote node and I can't see what happened until Monday.
>>>>> Yes, it looks like I've shot myself in the foot.
>>>>> This is just a test setup, so losing the cluster is not a problem, but I
>>>>> think a broken rule must not crash anything else and in the worst
>>>>> case should simply be ignored by the cluster / the crushtool compiler.
>>>>>
>>>>> Maybe someone can explain how this rule could crash the system? Maybe
>>>>> there is a crazy mistake somewhere?
>>>>
>>>>
>>>>
>>
>>
>
>
>
> --
> Have a nice day,
> Timofey.



-- 
Have a nice day,
Timofey.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



