ccs_config_validate in cluster 3.0.X

Hi everybody,

As briefly mentioned in the 3.0.4 release notes, a new system to validate
the configuration has been enabled in the code.

What it does
------------

The general idea is to be able to perform as many sanity checks on the
configuration as possible. This check allows us to spot the most common
mistakes, such as typos or possibly invalid values, in cluster.conf.


Configuring the validation
--------------------------

The validation system is integrated in several components.
It supports one config option that can take 3 values.

Via init script (or /etc/sysconfig/cman or distro equivalent):

CONFIG_VALIDATION=value

Values can be:
1) FAIL - enables a very strict check. Even a simple typo will prevent
the configuration from loading.

2) WARN - the check is relaxed. Warnings are printed on the screen, but
the cluster will continue to load. (default)

3) NONE - disables the config validation system. (discouraged!)

This is equivalent to:
cman_tool join/version -D(FAIL|WARN|NONE)
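
To make the mapping between the two forms concrete, here is a minimal sketch of how a wrapper could turn a CONFIG_VALIDATION value into the matching -D option. The function name validation_flag is hypothetical and this is not the code from the actual init script; it only assumes the three values and the default listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cman): translate a CONFIG_VALIDATION
# value into the matching cman_tool -D option described above.
validation_flag() {
    # Fall back to WARN, the documented default, when unset or empty.
    mode="${1:-WARN}"
    case "$mode" in
        FAIL|WARN|NONE)
            printf '%s\n' "-D$mode"
            ;;
        *)
            echo "unknown CONFIG_VALIDATION value: $mode" >&2
            return 1
            ;;
    esac
}

# Print the option for whatever is currently set in the environment.
validation_flag "${CONFIG_VALIDATION:-}"
```

A caller could then append the result to a cman_tool join or cman_tool version command line. Rejecting unknown values early avoids passing a mistyped mode through silently.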


What a user sees
----------------

The output of the validation process is very cryptic. Yes, we are
absolutely aware of that, and we are working on making it easier to
understand (if anybody has Relax-NG experience, please contact us).

This is the typical output from a normal startup (configuration contains
no errors or warnings):

[root@fedora-rh-node1 ~]# /etc/init.d/cman start join
Starting cluster:
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Setting network parameters...                           [  OK  ]
   Starting cman...                                        [  OK  ]
[root@fedora-rh-node1 ~]#

This is the output with a typo in cluster.conf (running in WARN mode):

[root@fedora-rh-node1 ~]# /etc/init.d/cman start join
Starting cluster:
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Setting network parameters...                           [  OK  ]
   Starting cman... tempfile:22: element quorum: Relax-NG validity error
: Element cluster has extra content: quorum
Configuration fails to validate
                                                           [  OK  ]

The error in this specific case is that the quorum element is wrong and
should be quorumd (for qdisk).

As you can see yourself, the output is not easy to understand without a
good understanding of Relax-NG.
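
Until the messages improve, the two useful pieces of an error line like the one above are the line number and the element name. The following is a rough sketch that extracts them with sed. The function name explain_validity_error is hypothetical, and the pattern assumes the message arrives on a single line in exactly the shape shown above; other Relax-NG messages may look different.

```shell
#!/bin/sh
# Hypothetical helper: pull the file, line number and element name out
# of a validity error line shaped like the ones shown above.
explain_validity_error() {
    printf '%s\n' "$1" | sed -n \
        's/^\([^:]*\):\([0-9][0-9]*\): element \([^:]*\): Relax-NG validity error.*$/\1 line \2: check element <\3>/p'
}

explain_validity_error 'tempfile:22: element quorum: Relax-NG validity error : Element cluster has extra content: quorum'
# prints: tempfile line 22: check element <quorum>
```

Lines that do not match the pattern produce no output, so the helper can be run over mixed startup output without adding noise.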

The check also happens before configuration updates via cman_tool
version. Here are 3 examples (I use -S to disable configuration
synchronization on my systems):

[root@fedora-rh-node1 ~]# cman_tool version -r 2 -S
[root@fedora-rh-node1 ~]#

cman_tool defaults to the strict check, so the same typo as above will
abort the configuration reload:

[root@fedora-rh-node4 ~]# cman_tool version -r 3 -S
tempfile:22: element quorum: Relax-NG validity error : Element cluster
has extra content: quorum
Configuration fails to validate
cman_tool: Not reloading, configuration is not valid

Disable the strict check and turn errors into warnings:

[root@fedora-rh-node1 ~]# cman_tool version -r 3 -S -DWARN
tempfile:22: element quorum: Relax-NG validity error : Element cluster
has extra content: quorum
Configuration fails to validate
[root@fedora-rh-node1 ~]#


What to do if there are errors
------------------------------

First of all do NOT panic.

This check integration is new and there might be several reasons why you
see a warning (including bugs in the validation schema).

Users with XML and Relax-NG experience should be able to sort it out easily.

For everyone else, we strongly recommend filing a bug on
bugzilla.redhat.com, including /etc/cluster/cluster.conf _AND_
/usr/share/cluster/cluster.rng.

This will allow us to cross-check for bugs in our validation code/schema
and help users fix their configuration files.
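
As a convenience, the two files can be bundled into a single attachment. This is only a sketch: the function name collect_validation_report and the output file name are hypothetical, and the default paths are simply the two mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: bundle the two files requested above into one
# tar archive that can be attached to a bug report.
collect_validation_report() {
    out="$1"
    conf="${2:-/etc/cluster/cluster.conf}"
    schema="${3:-/usr/share/cluster/cluster.rng}"

    # Refuse to build a partial archive if either file is unreadable.
    for f in "$conf" "$schema"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done

    tar -cf "$out" "$conf" "$schema"
}
```

For example, collect_validation_report /tmp/cluster-report.tar would archive both files from their default locations.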


Using ccs_config_validate standalone command
--------------------------------------------

Validation of a configuration is an important step.

ccs_config_validate is a very powerful and flexible tool, but requires
understanding of the config subsystem to be used correctly.

The average user can simply invoke ccs_config_validate with no options
and will see the same results as when it is invoked via cman_tool.
It achieves this by loading the same environment variables as the cman
init script and, respecting those settings, performing the required
actions.
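
The idea of reusing the init script settings can be sketched as follows. This is not the tool's actual code: the function name load_validation_mode is hypothetical, and the sketch only assumes the /etc/sysconfig/cman location and the WARN default described earlier.

```shell
#!/bin/sh
# Hypothetical helper: read CONFIG_VALIDATION from a sysconfig-style
# file, falling back to the documented default (WARN).
load_validation_mode() {
    sysconfig="${1:-/etc/sysconfig/cman}"
    (
        # Subshell: variables set by the sourced file do not leak
        # into the caller's environment.
        if [ -r "$sysconfig" ]; then
            . "$sysconfig"
        fi
        printf '%s\n' "${CONFIG_VALIDATION:-WARN}"
    )
}
```

Calling load_validation_mode with no argument reads the default location; a different path can be passed for distributions that keep the file elsewhere.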

There are also advanced use cases for the tool, such as migrating from
one config subsystem to another (cluster.conf to LDAP, for example),
but, generally, anyone who needs to make changes of this magnitude is
also expected to have a good understanding of the configuration
subsystem (a new document will be available shortly for both developers
and advanced users).

Please do not hesitate to ask for clarifications or report bugs.

Cheers
Fabio


--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
