startup considerations for v 2.x

The corosync daemon v1.99.9 may fail on startup when it is run with an older configuration file.  If the value "rrp_mode: active" is added to the example configuration, the back trace below is easily recreated, but perhaps this has already been addressed.  Without the "rrp_mode:" token in the example file, the daemon starts up.  In addition, the daemon does not seem to start with files in the uidgid.d directory that were valid in previous releases.

Perhaps the parsing code could be augmented to identify the new functionality requirements and default to reasonable values, to aid migration from older configuration files to newer ones?   Would it be a reasonable requirement that the upgrade path is to install the new software and run it on the older configuration files?   Is it important to use the "QUORUM" subsystem for logging, and should it be used in older releases?
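
To make the "default to reasonable values" idea concrete, here is a minimal sketch in C. It is not corosync's parser: the struct kv type and the kv_get_or_default() helper are hypothetical names made up for illustration, and the fallback string is whatever the caller chooses, not a documented corosync default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair, as a config parser might hand it over.
 * This is an illustration only, not a corosync data structure. */
struct kv {
    const char *key;
    const char *value;
};

/* Return the configured value for `key`, or `fallback` when an older
 * configuration file does not mention the key at all. */
const char *kv_get_or_default(const struct kv *items, size_t count,
                              const char *key, const char *fallback)
{
    assert(key != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(items[i].key, key) == 0) {
            return items[i].value;
        }
    }
    return fallback;
}
```

With a helper along these lines, a totem section that sets only "version" and "rrp_mode" would still yield a usable value for a newer key such as "crypto_cipher", because the lookup falls back to the caller's default instead of failing.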

I apologize in advance if these issues were already discussed.

dan


a) Tried with a single uidgid.d file (see below).  Has the format of this file changed?
[root@tarn exec]# /usr/sbin/corosync -f
notice  [MAIN  ] Corosync Cluster Engine ('1.99.9'): started and ready to provide service.
info    [MAIN  ] Corosync built-in features:
error   [MAIN  ] uidgid: Only uid and gid are allowed items
error   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1078.

% cat /etc/corosync/uidgid.d/auser
uidgid {
    uid: auser
    gid: auser
}
b) Tried removing all uidgid.d files and received the following crash
(no interfaces defined for the stats structure):
(gdb) run -f
Starting program: /local/dclark/Downloads/corosync-1.99.9/exec/corosync -f
[Thread debugging using libthread_db enabled]
notice  [MAIN  ] Corosync Cluster Engine ('1.99.9'): started and ready to provide service.
info    [MAIN  ] Corosync built-in features:
[New Thread 0x7ffff6955700 (LWP 27007)]

Program received signal SIGSEGV, Segmentation fault.
active_instance_initialize (rrp_instance=0x740cf0, interface_count=1)
    at totemrrp.c:1272
1272            stats_set_interface_faulty (rrp_instance, i, 0);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.47.el6_2.9.x86_64 nspr-4.8.9-3.el6_2.x86_64 nss-3.13.1-7.el6_2.x86_64 nss-util-3.13.1-3.el6_2.x86_64 zlib-1.2.3-25.el6.x86_64
(gdb) where
#0  active_instance_initialize (rrp_instance=0x740cf0, interface_count=1)
    at totemrrp.c:1272
#1  0x00007ffff7dce2ea in totemrrp_algorithm_set (poll_handle=0x6f7870,
    rrp_context=0x7ffff5f4a380, totem_config=0x7fffffffde60,
    stats=<value optimized out>, context=0x7ffff5f18010,
    deliver_fn=0x7ffff7dceb60 <main_deliver_fn>,
    iface_change_fn=0x7ffff7dd1530 <main_iface_change_fn>,
    token_seqid_get=0x7ffff7dce370 <main_token_seqid_get>,
    msgs_missing=0x7ffff7dce390 <main_msgs_missing>,
    target_set_completed=0x7ffff7dcfc00 <target_set_completed>)
    at totemrrp.c:1694
#2  totemrrp_initialize (poll_handle=0x6f7870, rrp_context=0x7ffff5f4a380,
    totem_config=0x7fffffffde60, stats=<value optimized out>,
    context=0x7ffff5f18010, deliver_fn=0x7ffff7dceb60 <main_deliver_fn>,
    iface_change_fn=0x7ffff7dd1530 <main_iface_change_fn>,
    token_seqid_get=0x7ffff7dce370 <main_token_seqid_get>,
    msgs_missing=0x7ffff7dce390 <main_msgs_missing>,
    target_set_completed=0x7ffff7dcfc00 <target_set_completed>)
    at totemrrp.c:1887
#3  0x00007ffff7dd0ef7 in totemsrp_initialize (poll_handle=0x6f7870,
    srp_context=0x7ffff7fe6e88, totem_config=0x7fffffffde60, stats=0x6ff3e0,
    deliver_fn=0x7ffff7dd7aa0 <totemmrp_deliver_fn>,
    confchg_fn=<value optimized out>) at totemsrp.c:934
---Type <return> to continue, or q <return> to quit---
#4  0x00007ffff7dd8dc7 in totempg_initialize (poll_handle=0x6f7870,
    totem_config=0x7fffffffde60) at totempg.c:757
#5  0x0000000000417f70 in main (argc=<value optimized out>,
    argv=<value optimized out>, envp=<value optimized out>) at main.c:1172
(gdb) p *rrp_instance
$1 = {poll_handle = 0x0, interfaces = 0x0, rrp_algo = 0x7ffff7fe1a40,
  context = 0x0, status = {0x0, 0x0}, totemrrp_deliver_fn = 0,
  totemrrp_iface_change_fn = 0, totemrrp_token_seqid_get = 0,
  totemrrp_target_set_completed = 0, totemrrp_msgs_missing = 0,
  totemrrp_log_level_security = 0, totemrrp_log_level_error = 0,
  totemrrp_log_level_warning = 0, totemrrp_log_level_notice = 0,
  totemrrp_log_level_debug = 0, totemrrp_subsys_id = 0,
  totemrrp_log_printf = 0, net_handles = 0x0, rrp_algo_instance = 0x0,
  interface_count = 0, processor_count = 0, my_nodeid = 0,
  totem_config = 0x7fffffffde60, deliver_fn_context = {0x0, 0x0},
  timer_active_test_ring_timeout = {0, 0}, stats = {hdr = {is_dirty = 0,
      last_updated = 0}, net = 0x0, algo_name = 0x0, faulty = 0x0,
    interface_count = 0}}
(gdb)
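
The dump above shows stats.faulty = 0x0 and stats.interface_count = 0 at the point where stats_set_interface_faulty() is called, which would be consistent with a write through a NULL pointer. As a sketch only, a defensive setter could refuse the write instead of crashing. The struct ring_stats type and the ring_stats_set_faulty() function below are hypothetical stand-ins, not the actual totemrrp.c definitions, and a guard like this would only avoid the crash, not explain why the structure is uninitialized.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-ring stats seen in the gdb dump.
 * The field names mirror the dump; the types are assumptions. */
struct ring_stats {
    unsigned int interface_count;
    unsigned char *faulty;   /* one flag per interface, may be NULL */
};

/* Set the faulty flag for one interface.
 * Returns 0 on success, -1 if the stats are not initialized or the
 * interface index is out of range. */
int ring_stats_set_faulty(struct ring_stats *stats, unsigned int iface,
                          unsigned char value)
{
    if (stats == NULL || stats->faulty == NULL ||
        iface >= stats->interface_count) {
        return -1;
    }
    stats->faulty[iface] = value;
    return 0;
}
```

A caller could then log the -1 return and exit cleanly with a configuration error rather than a SIGSEGV.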

Here is the new example configuration file, modified with rrp_mode to cause the crash:


# cat corosync.conf
# Please read the corosync.conf.5 manual page
totem {
    version: 2
    rrp_mode: active

    # crypto_cipher and crypto_hash: Used for mutual node authentication.
    # If you choose to enable this, then do remember to create a shared
    # secret with "corosync-keygen".
    crypto_cipher: none
    crypto_hash: none

    # interface: define at least one interface to communicate
    # over. If you define more than one interface stanza, you must
    # also set rrp_mode.
    interface {
        # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0
        # This is normally the *network* address of the
        # interface to bind to. This ensures that you can use
        # identical instances of this configuration file
        # across all your cluster nodes, without having to
        # modify this option.
        bindnetaddr: 10.109.20.0
        # However, if you have multiple physical network
        # interfaces configured for the same subnet, then the
        # network address alone is not sufficient to identify
        # the interface Corosync should bind to. In that case,
        # configure the *host* address of the interface
        # instead:
        # bindnetaddr: 192.168.1.1
        # When selecting a multicast address, consider RFC
        # 2365 (which, among other things, specifies that
        # 239.255.x.x addresses are left to the discretion of
        # the network administrator). Do not reuse multicast
        # addresses across multiple Corosync clusters sharing
        # the same network.
        # mcastaddr: 239.255.1.1
        mcastaddr: 239.192.105.99
        # Corosync uses the port you specify here for UDP
        # messaging, and also the immediately preceding
        # port. Thus if you set this to 5405, Corosync sends
        # messages over UDP ports 5405 and 5404.
        mcastport: 5405
        # Time-to-live for cluster communication packets. The
        # number of hops (routers) that this ring will allow
        # itself to pass. Note that multicast routing must be
        # specifically enabled on most network routers.
        ttl: 1
    }
}

logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off
    # Log to standard error. When in doubt, set to no. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: yes
    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: yes
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    # Log messages with time stamps. When in doubt, set to on
    # (unless you are only logging to syslog, where double
    # timestamps can be annoying).
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}


Note: the following is an example of an older configuration file.
#
# configuration for corosync
# Please read the corosync.conf.5 manual page
#
compatibility: whitetank

totem {
        version: 2
        rrp_mode: active
        interface {
                ringnumber: 0
                bindnetaddr: 10.0.0.0
                mcastaddr: 239.192.109.99
                mcastport: 5407
        }
}

logging {
        timestamp: on
        fileline: on
        function_name: on
        to_stderr: yes
        to_logfile: no
        to_syslog: yes
        logfile: /var/log/corosync
        # logfile_priority - alert|crit|debug|emerg|err|info|notice|warning
#       logfile_priority: info
        # syslog_facility - daemon, local0, ... local7
#       syslog_priority: info
        debug: off
        trace: none|enter|leave|trace1|trace2|trace3
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}