On 04/05/2012 08:11 AM, dan clark wrote:
> The corosync daemon v 1.99.9 when misconfigured using an older
> configuration file may have problems on startup.  If the value
> "rrp_mode: active" is added to the example configuration the back trace
> below is easily recreated, but perhaps this has already been addressed.
> Without the "rrp_mode:" token in the example file the daemon started
> up.  In addition, the daemon does not seem to start with files valid in
> previous releases in the uidgid.d directory.
>
> Perhaps the parsing code could be augmented to identify the new
> functionality requirements and default to reasonable values to aid in
> migration from older configuration files to newer files?  Would an
> upgrade path be to install the new software and run on older

That was the plan - perhaps backwards compatibility file testing wasn't
sufficient...

> configuration files be a reasonable requirement?  Is it important to
> use the "QUORUM" subsystem for logging and should it be used in older
> releases?
>
> I apologize in advance if these issues were already discussed.
>
> dan
>
>
> a) tried with a single uidgid.d file (see below).  Is there a change in
> the format of this file?
> [root@tarn exec]# /usr/sbin/corosync -f
> notice [MAIN ] Corosync Cluster Engine ('1.99.9'): started and ready
> to provide service.
> info [MAIN ] Corosync built-in features:
> error [MAIN ] uidgid: Only uid and gid are allowed items
> error [MAIN ] Corosync Cluster Engine exiting with status 8 at
> main.c:1078.
>
> % cat /etc/corosync/uidgid.d/auser
> uidgid {
>         uid: auser
>         gid: auser
> }
> b) tried removing all uidgid.d files and received the following crash:
> (no interfaces defined for the stats structure)
> (gdb) run -f
> Starting program: /local/dclark/Downloads/
> corosync-1.99.9/exec/corosync -f
> [Thread debugging using libthread_db enabled]
> notice [MAIN ] Corosync Cluster Engine ('1.99.9'): started and ready
> to provide service.
> info [MAIN ] Corosync built-in features:
> [New Thread 0x7ffff6955700 (LWP 27007)]
>
> Program received signal SIGSEGV, Segmentation fault.
> active_instance_initialize (rrp_instance=0x740cf0, interface_count=1)
>     at totemrrp.c:1272
> 1272            stats_set_interface_faulty (rrp_instance, i, 0);
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.47.el6_2.9.x86_64 nspr-4.8.9-3.el6_2.x86_64
> nss-3.13.1-7.el6_2.x86_64 nss-util-3.13.1-3.el6_2.x86_64
> zlib-1.2.3-25.el6.x86_64
> (gdb) where
> #0  active_instance_initialize (rrp_instance=0x740cf0, interface_count=1)
>     at totemrrp.c:1272
> #1  0x00007ffff7dce2ea in totemrrp_algorithm_set (poll_handle=0x6f7870,
>     rrp_context=0x7ffff5f4a380, totem_config=0x7fffffffde60,
>     stats=<value optimized out>, context=0x7ffff5f18010,
>     deliver_fn=0x7ffff7dceb60 <main_deliver_fn>,
>     iface_change_fn=0x7ffff7dd1530 <main_iface_change_fn>,
>     token_seqid_get=0x7ffff7dce370 <main_token_seqid_get>,
>     msgs_missing=0x7ffff7dce390 <main_msgs_missing>,
>     target_set_completed=0x7ffff7dcfc00 <target_set_completed>)
>     at totemrrp.c:1694
> #2  totemrrp_initialize (poll_handle=0x6f7870, rrp_context=0x7ffff5f4a380,
>     totem_config=0x7fffffffde60, stats=<value optimized out>,
>     context=0x7ffff5f18010, deliver_fn=0x7ffff7dceb60 <main_deliver_fn>,
>     iface_change_fn=0x7ffff7dd1530 <main_iface_change_fn>,
>     token_seqid_get=0x7ffff7dce370 <main_token_seqid_get>,
>     msgs_missing=0x7ffff7dce390 <main_msgs_missing>,
>     target_set_completed=0x7ffff7dcfc00 <target_set_completed>)
>     at totemrrp.c:1887
> #3  0x00007ffff7dd0ef7 in totemsrp_initialize (poll_handle=0x6f7870,
>     srp_context=0x7ffff7fe6e88, totem_config=0x7fffffffde60,
>     stats=0x6ff3e0,
>     deliver_fn=0x7ffff7dd7aa0 <totemmrp_deliver_fn>,
>     confchg_fn=<value optimized out>) at totemsrp.c:934
> ---Type <return> to continue, or q <return> to quit---
> #4  0x00007ffff7dd8dc7 in totempg_initialize (poll_handle=0x6f7870,
>     totem_config=0x7fffffffde60) at totempg.c:757
> #5  0x0000000000417f70 in main
>     (argc=<value optimized out>,
>     argv=<value optimized out>, envp=<value optimized out>) at main.c:1172
> (gdb) p *rrp_instance
> $1 = {poll_handle = 0x0, interfaces = 0x0, rrp_algo = 0x7ffff7fe1a40,
>   context = 0x0, status = {0x0, 0x0}, totemrrp_deliver_fn = 0,
>   totemrrp_iface_change_fn = 0, totemrrp_token_seqid_get = 0,
>   totemrrp_target_set_completed = 0, totemrrp_msgs_missing = 0,
>   totemrrp_log_level_security = 0, totemrrp_log_level_error = 0,
>   totemrrp_log_level_warning = 0, totemrrp_log_level_notice = 0,
>   totemrrp_log_level_debug = 0, totemrrp_subsys_id = 0,
>   totemrrp_log_printf = 0, net_handles = 0x0, rrp_algo_instance = 0x0,
>   interface_count = 0, processor_count = 0, my_nodeid = 0,
>   totem_config = 0x7fffffffde60, deliver_fn_context = {0x0, 0x0},
>   timer_active_test_ring_timeout = {0, 0}, stats = {hdr = {is_dirty = 0,
>       last_updated = 0}, net = 0x0, algo_name = 0x0, faulty = 0x0,
>     interface_count = 0}}
> (gdb)
>
> Here is the new example configuration file, modified with the rrp_mode
> setting that causes the crash:
>
>
> # cat corosync.conf
> # Please read the corosync.conf.5 manual page
> totem {
>         version: 2
>         rrp_mode: active
>
>         # crypto_cipher and crypto_hash: Used for mutual node authentication.
>         # If you choose to enable this, then do remember to create a shared
>         # secret with "corosync-keygen".
>         crypto_cipher: none
>         crypto_hash: none
>
>         # interface: define at least one interface to communicate
>         # over. If you define more than one interface stanza, you must
>         # also set rrp_mode.
>         interface {
>                 # Rings must be consecutively numbered, starting at 0.
>                 ringnumber: 0
>                 # This is normally the *network* address of the
>                 # interface to bind to. This ensures that you can use
>                 # identical instances of this configuration file
>                 # across all your cluster nodes, without having to
>                 # modify this option.
>                 bindnetaddr: 10.109.20.0
>                 # However, if you have multiple physical network
>                 # interfaces configured for the same subnet, then the
>                 # network address alone is not sufficient to identify
>                 # the interface Corosync should bind to. In that case,
>                 # configure the *host* address of the interface
>                 # instead:
>                 # bindnetaddr: 192.168.1.1
>                 # When selecting a multicast address, consider RFC
>                 # 2365 (which, among other things, specifies that
>                 # 239.255.x.x addresses are left to the discretion of
>                 # the network administrator). Do not reuse multicast
>                 # addresses across multiple Corosync clusters sharing
>                 # the same network.
>                 # mcastaddr: 239.255.1.1
>                 mcastaddr: 239.192.105.99
>                 # Corosync uses the port you specify here for UDP
>                 # messaging, and also the immediately preceding
>                 # port. Thus if you set this to 5405, Corosync sends
>                 # messages over UDP ports 5405 and 5404.
>                 mcastport: 5405
>                 # Time-to-live for cluster communication packets. The
>                 # number of hops (routers) that this ring will allow
>                 # itself to pass. Note that multicast routing must be
>                 # specifically enabled on most network routers.
>                 ttl: 1
>         }
> }
>
> logging {
>         # Log the source file and line where messages are being
>         # generated. When in doubt, leave off. Potentially useful for
>         # debugging.
>         fileline: off
>         # Log to standard error. When in doubt, set to no. Useful when
>         # running in the foreground (when invoking "corosync -f")
>         to_stderr: yes
>         # Log to a log file. When set to "no", the "logfile" option
>         # must not be set.
>         to_logfile: yes
>         logfile: /var/log/cluster/corosync.log
>         # Log to the system log daemon. When in doubt, set to yes.
>         to_syslog: yes
>         # Log debug messages (very verbose). When in doubt, leave off.
>         debug: off
>         # Log messages with time stamps. When in doubt, set to on
>         # (unless you are only logging to syslog, where double
>         # timestamps can be annoying).
>         timestamp: on
>         logger_subsys {
>                 subsys: QUORUM
>                 debug: off
>         }
> }
>
>
> Note, the following is an example of an older configuration file.
> #
> # configuration for corosync
> # Please read the corosync.conf.5 manual page
> #
> compatibility: whitetank
>
> totem {
>         version: 2
>         rrp_mode: active
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 10.0.0.0
>                 mcastaddr: 239.192.109.99
>                 mcastport: 5407
>         }
> }
>
> logging {
>         timestamp: on
>         fileline: on
>         function_name: on
>         to_stderr: yes
>         to_logfile: no
>         to_syslog: yes
>         logfile: /var/log/corosync
>         # logfile_priority - alert|crit|debug|emerg|err|info|notice|warning
>         # logfile_priority: info
>         # syslog_facility - daemon, local0, ... local7
>         # syslog_priority: info
>         debug: off
>         trace: none|enter|leave|trace1|trace2|trace3
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
>
> amf {
>         mode: disabled
> }
>
>
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
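
Until the parser handles this case itself, a rough pre-flight check along
the lines below might catch the combination reported above (rrp_mode set
while only one interface stanza is defined) before the daemon is started.
This is only a sketch: it is not corosync's own parser, the function name
check_totem and the SAMPLE text are made up for illustration, and treating
any rrp_mode other than "none" as suspicious with a single ring is an
assumption (only "active" was reported in this thread).

```python
import re

# Hypothetical sample: the totem section of the older config quoted above.
SAMPLE = """\
totem {
        version: 2
        rrp_mode: active
        interface {
                ringnumber: 0
                bindnetaddr: 10.0.0.0
                mcastaddr: 239.192.109.99
                mcastport: 5407
        }
}
"""


def check_totem(conf_text):
    """Return warnings about rrp_mode vs. the number of interface stanzas.

    Illustrative only; this does not reproduce corosync's real parsing rules.
    """
    warnings = []
    stack = []        # names of the sections currently open
    interfaces = 0    # interface stanzas seen directly under totem
    rrp_mode = None   # value of totem.rrp_mode, if present

    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        opened = re.fullmatch(r"(\w+)\s*\{", line)
        if opened:
            stack.append(opened.group(1))
            if stack == ["totem", "interface"]:
                interfaces += 1
            continue
        if line == "}":
            if stack:
                stack.pop()
            continue
        key, sep, value = line.partition(":")
        if sep and stack == ["totem"] and key.strip() == "rrp_mode":
            rrp_mode = value.strip()

    # Rule stated in the example config's own comments.
    if interfaces > 1 and rrp_mode is None:
        warnings.append("more than one interface stanza but rrp_mode is not set")
    # Assumption: flag any redundant-ring mode when at most one ring exists.
    if interfaces <= 1 and rrp_mode not in (None, "none"):
        warnings.append(
            f"rrp_mode: {rrp_mode} set with {interfaces} interface stanza(s)"
        )
    return warnings


if __name__ == "__main__":
    for message in check_totem(SAMPLE):
        print("warning:", message)
```

Calling check_totem() on the text of a real corosync.conf returns an empty
list when neither condition applies, so it could be wired into an upgrade
script as a warning step rather than a hard failure.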