Hi all,
I'm running some tests on OCFS2 with a 2.6.32-100 kernel (Oracle), and also
on RHEL6/Fedora, and I get a hang with a kernel BUG in fs/dlm/lowcomms.c, as
you can see below. I have a crash dump if you need more information. I'm lost
and would appreciate help on where to look to debug this problem.
Thanks
Regards,
Benoit
Kernel 2.6.32-100.0.19.el5 on an x86_64
chili0 login: ------------[ cut here ]------------
kernel BUG at fs/dlm/lowcomms.c:647!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/dlm/14E8093BB71D447EBEE691622CF86B9C/control
CPU 34
Modules linked in: ocfs2(U) ocfs2_nodemanager(U) nfsd(U) exportfs(U)
sctp(U) libcrc32c(U) ocfs2_stack_user(U) ocfs2_stackglue(U) dlm(U)
configfs(U) acpi_cpufreq(U) freq_table(U) ipmi_devintf(U) ipmi_si(U)
ipmi_msghandler(U) nfs(U) lockd(U) fscache(U) nfs_acl(U)
auth_rpcgss(U) sunrpc(U) ipv6(U) scsi_dh_emc(U) dm_round_robin(U)
dm_multipath(U) iTCO_wdt(U) iTCO_vendor_support(U) mlx4_core(U)
i2c_i801(U) igb(U) pcspkr(U) i2c_core(U) ioatdma(U) dca(U) ahci(U)
uhci_hcd(U) ehci_hcd(U) lpfc(U) scsi_transport_fc(U) scsi_tgt(U) [last
unloaded: ocfs2_nodemanager]
Pid: 27062, comm: dlm_recv/34 Not tainted 2.6.32-100.0.19.el5 #1 bullx
super-node
RIP: 0010:[<ffffffffa02406c3>] [<ffffffffa02406c3>]
receive_from_sock+0x554/0x6ed [dlm]
RSP: 0018:ffff880c77c6bc60 EFLAGS: 00010246
RAX: 0000000000000030 RBX: ffff8810774b8d30 RCX: ffff88087c4548f8
RDX: 0000000000000030 RSI: ffff880876dce000 RDI: ffffffff81398045
RBP: ffff880c77c6be50 R08: ffff000000000000 R09: ffff880c77c6b900
R10: ffff880c77c6b8f0 R11: 0000000000000030 R12: 0000000000000030
R13: ffff8810774b8d20 R14: ffff880c7caa00c0 R15: ffffffffa023ecca
FS: 0000000000000000(0000) GS:ffff88048e600000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000fcb078 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dlm_recv/34 (pid: 27062, threadinfo ffff880c77c6a000, task
ffff880c7caa00c0)
Stack:
ffff880c77c6bc70 ffffffff8122fa24 ffff880c77c6bc90 ffffffff8122faca
<0> ffff88048e414ec0 0000100000000002 0000000000000000 ffffffff00000000
<0> 0000000000000000 0000000000000000 ffffffffa024bb20 0000000000000030
Call Trace:
[<ffffffff8122fa24>] ? cpumask_next+0x19/0x1b
[<ffffffff8122faca>] ? cpumask_next_and+0x20/0x32
[<ffffffffa023ecca>] ? process_recv_sockets+0x0/0x28 [dlm]
[<ffffffffa023ecea>] process_recv_sockets+0x20/0x28 [dlm]
[<ffffffff81071802>] worker_thread+0x14d/0x1ed
[<ffffffff81075a7c>] ? autoremove_wake_function+0x0/0x3d
[<ffffffff810716b5>] ? worker_thread+0x0/0x1ed
[<ffffffff810756d3>] kthread+0x6e/0x76
[<ffffffff81012dea>] child_rip+0xa/0x20
[<ffffffff81075665>] ? kthread+0x0/0x76
[<ffffffff81012de0>] ? child_rip+0x0/0x20
Code: 29 e7 ff ff e9 2d 01 00 00 41 8b 74 24 10 0f b7 d0 48 c7 c7 d1
8c 24 a0 31 c0 e8 ab 71 e1 e0 e9 12 01 00 00 41 83 7d 08 00 75 04 <0f>
0b eb fe 4d 8d 7d 68 49 be 00 00 00 00 00 16 00 00 41 8b 55
RIP [<ffffffffa02406c3>] receive_from_sock+0x554/0x6ed [dlm]
RSP <ffff880c77c6bc60>
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-100.0.19.el5 (mockbuild@xxxxxxxxxxxxxxxxxxxxxxx)
(gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)) #1 SMP Fri Sep 17
17:51:41 EDT 2010
Command line: ro root=/dev/mapper/vg_chili0-lv_root
rd_LVM_LV=vg_chili0/lv_root rd_LVM_LV=vg_chili0/lv_swap rd_NO_LUKS
rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16
KEYBOARDTYPE=pc KEYTABLE=fr-pc cgroup_disable=memory selinux=0
pcie_aspm=off nmi_watchdog=0 console=ttyS1,115200 maxcpus=1
reset_devices memmap=exactmap memmap=640K@0K memmap=195948K@33408K
elfcorehdr=229356K memmap=308K#1993940K memmap=16K#2077704K
memmap=4K#2077748K memmap=4K#2077764K memmap=44K#2077768K
memmap=72K#2077812K memmap=4K#2077884K memmap=4K#2077888K
memmap=4K#2077892K memmap=4K#2078024K memmap=2716K#2078052K
memmap=1024K#69204860K memmap=128K#69205884K
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
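As a rough aid for reading the trace above, here is a minimal Python sketch that pulls the BUG location and the faulting symbol out of the oops text. The helper name summarize_oops and both regular expressions are illustrative assumptions based only on the lines quoted in this mail; they are not part of any kernel tool.

```python
import re

# Illustrative patterns for the two oops lines of interest (assumed format,
# matching the trace quoted above).
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
SYM_RE = re.compile(r"(\w+)\+(0x[0-9a-f]+)/0x[0-9a-f]+(?:\s+\[(\w+)\])?")


def summarize_oops(text):
    """Return the BUG source location and the faulting symbol, or None."""
    bug = BUG_RE.search(text)
    rip_at = text.find("RIP")
    if bug is None or rip_at < 0:
        return None
    # The first symbol+offset after "RIP" is the faulting instruction; the
    # console may wrap it onto the next line, so search across newlines.
    sym = SYM_RE.search(text, rip_at)
    if sym is None:
        return None
    return {
        "file": bug.group(1),
        "line": int(bug.group(2)),
        "func": sym.group(1),
        "offset": int(sym.group(2), 16),
        "module": sym.group(3),  # None when the symbol is in the core kernel
    }


# Excerpt copied from the console log above.
OOPS = """\
kernel BUG at fs/dlm/lowcomms.c:647!
RIP: 0010:[<ffffffffa02406c3>] [<ffffffffa02406c3>]
receive_from_sock+0x554/0x6ed [dlm]
"""

info = summarize_oops(OOPS)
print(info)
# A gdb expression to try against a module built with debug info:
print("list *(%s+%#x)" % (info["func"], info["offset"]))
```

If a dlm module built with debug info is available, the printed expression can be pasted into gdb to map the offset back to a source line. Whether that line matches fs/dlm/lowcomms.c:647 depends on the module being the exact build that crashed.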
Here is the configuration:
[root@chili1 ~]# crm configure show
node chili0
node chili1
primitive IPaddr-dhcp ocf:Bull:IPaddr \
params ip="11.1.0.20" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive IPaddr-dns ocf:Bull:IPaddr \
params ip="11.1.0.21" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive IPaddr-monitoring-master ocf:Bull:IPaddr \
params ip="11.1.0.22" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive IPaddr-mysql ocf:Bull:IPaddr \
params ip="11.1.0.23" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive IPaddr-nfs ocf:Bull:IPaddr \
params ip="11.1.0.24" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive IPaddr-postgresql ocf:Bull:IPaddr \
params ip="11.1.0.25" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive IPaddr-tftp ocf:Bull:IPaddr \
params ip="11.1.0.26" \
op monitor on-fail="restart" interval="30" \
meta migration-threshold="1"
primitive dhcp-dhcp-server lsb:dhcpd \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive dlm ocf:pacemaker:controld \
op monitor interval="120s"
primitive dns-dns-server lsb:named \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive fs-BCM-MCO ocf:Bull:Filesystem \
params device="-L HA_MNGT:MCO" directory="/BCM/MCO"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive fs-BCM-conf ocf:Bull:Filesystem \
params device="-L HA_MNGT:CONF" directory="/BCM/conf"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive fs-BCM-console ocf:Bull:Filesystem \
params device="-L HA_MNGT:CONSOLE" directory="/BCM/console"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive fs-BCM-data ocf:Bull:Filesystem \
params device="-L HA_MNGT:RRDDBs" directory="/BCM/data"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive fs-BCM-log ocf:Bull:Filesystem \
params device="-L HA_MNGT:LOGs" directory="/BCM/log"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive fs-BCM-storage ocf:Bull:Filesystem \
params device="-L HA_MNGT:STORAGE" directory="/BCM/storage"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive monitoring-master-errorManager lsb:errorManager \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive monitoring-master-eventManager lsb:eventManager \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive monitoring-master-nagios lsb:nagios \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive monitoring-master-powerManager lsb:powerManager \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive monitoring-master-syslog-ng lsb:syslog-ng-monitoring \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive mysql-fs-DBs ocf:Bull:Filesystem \
params device="-L HA_MNGT:MYSQLDBs" directory="/var/lib/mysql"
fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive mysql-mysqld ocf:heartbeat:mysql \
params binary="/usr/bin/mysqld_safe"
pid="/var/run/mysqld/mysqld.pid" \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive nfs-nfs-server ocf:heartbeat:nfsserver \
params nfs_init_script="/etc/init.d/nfs"
nfs_notify_cmd="/usr/sbin/sm-notify"
nfs_shared_infodir="/BCM/log/nfs-server-logs" nfs_ip="11.1.0.24" \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60"
primitive o2cb ocf:ocfs2:o2cb \
op monitor interval="120s"
primitive postgresql-clusterdb ocf:heartbeat:pgsql \
params pgdata="/var/lib/pgsql/data" \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
primitive postgresql-fs-DBs ocf:Bull:Filesystem \
params device="-L HA_MNGT:PGSQLDBs"
directory="/var/lib/pgsql/data" fstype="ocfs2" \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40"
primitive restofencechili0 stonith:fence_ipmilan \
params ipaddr="11.1.0.10" login="super" passwd="pass"
pcmk_host_check="none" action="diag" \
meta target-role="Stopped"
primitive restofencechili1 stonith:fence_ipmilan \
params ipaddr="11.1.0.11" login="super" passwd="pass"
pcmk_host_check="none" action="diag" \
meta target-role="Stopped"
primitive syslog-ng-syslog-ng lsb:hasyslog-ng \
op start interval="0" timeout="60" \
op stop interval="0" timeout="60" \
op monitor interval="20" timeout="40" on-fail="restart" \
meta migration-threshold="3"
primitive tftp-tftp-server lsb:xinetd \
op start interval="0" timeout="120" \
op stop interval="0" timeout="120" \
op monitor interval="20" timeout="60" on-fail="restart"
start-delay="60" \
meta migration-threshold="1"
group dhcp IPaddr-dhcp dhcp-dhcp-server \
meta target-role="Started" migration-threshold="1"
group dns IPaddr-dns dns-dns-server \
meta target-role="Started" migration-threshold="1"
group monitoring-master IPaddr-monitoring-master
monitoring-master-syslog-ng monitoring-master-nagios
monitoring-master-errorManager monitoring-master-eventManager
monitoring-master-powerManager \
meta target-role="Started" migration-threshold="1"
group mysql IPaddr-mysql mysql-mysqld \
meta target-role="Started" migration-threshold="1"
group nfs IPaddr-nfs nfs-nfs-server \
meta target-role="Started" migration-threshold="1"
group postgresql IPaddr-postgresql postgresql-clusterdb \
meta target-role="Started" migration-threshold="1"
group tftp IPaddr-tftp tftp-tftp-server \
meta target-role="Started" migration-threshold="1"
clone clone-dlm dlm \
meta target-role="Started" globally-unique="false"
interleave="true"
clone clone-fs-BCM-MCO fs-BCM-MCO \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-fs-BCM-conf fs-BCM-conf \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-fs-BCM-console fs-BCM-console \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-fs-BCM-data fs-BCM-data \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-fs-BCM-log fs-BCM-log \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-fs-BCM-storage fs-BCM-storage \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-mysql-fs-DBs mysql-fs-DBs \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-o2cb o2cb \
meta target-role="Started" globally-unique="false"
interleave="true"
clone clone-postgresql-fs-DBs postgresql-fs-DBs \
meta interleave="true" ordered="false" true
target-role="Started" \
meta target-role="Started"
clone clone-syslog-ng syslog-ng-syslog-ng \
meta interleave="true" ordered="false" target-role="Stopped" \
meta target-role="Stopped"
location forbiddenloc-restofencechili0 restofencechili0 -inf: chili0
location forbiddenloc-restofencechili1 restofencechili1 -inf: chili1
location loc1-group-dhcp dhcp +100: chili0
location loc1-group-dns dns +100: chili1
location loc1-group-monitoring-master monitoring-master +100: chili0
location loc1-group-mysql mysql +100: chili1
location loc1-group-nfs nfs +100: chili1
location loc1-group-postgresql postgresql +100: chili1
location loc1-group-tftp tftp +100: chili0
location loc1-restofencechili0 restofencechili0 +inf: chili1
location loc1-restofencechili1 restofencechili1 +inf: chili0
colocation coloc-clone-fs-BCM-MCO-o2cb inf: clone-fs-BCM-MCO clone-o2cb
colocation coloc-clone-fs-BCM-conf-o2cb inf: clone-fs-BCM-conf clone-o2cb
colocation coloc-clone-fs-BCM-console-o2cb inf: clone-fs-BCM-console
clone-o2cb
colocation coloc-clone-fs-BCM-data-o2cb inf: clone-fs-BCM-data clone-o2cb
colocation coloc-clone-fs-BCM-log-o2cb inf: clone-fs-BCM-log clone-o2cb
colocation coloc-clone-fs-BCM-storage-o2cb inf: clone-fs-BCM-storage
clone-o2cb
colocation coloc-clone-mysql-fs-DBs-o2cb inf: clone-mysql-fs-DBs
clone-o2cb
colocation coloc-clone-postgresql-fs-DBs-o2cb inf:
clone-postgresql-fs-DBs clone-o2cb
colocation coloc-fs-BCM-MCO-monitoring-master +inf: monitoring-master
clone-fs-BCM-MCO
colocation coloc-fs-BCM-MCO-nfs +inf: nfs clone-fs-BCM-MCO
colocation coloc-fs-BCM-conf-monitoring-master +inf: monitoring-master
clone-fs-BCM-conf
colocation coloc-fs-BCM-conf-nfs +inf: nfs clone-fs-BCM-conf
colocation coloc-fs-BCM-console-nfs +inf: nfs clone-fs-BCM-console
colocation coloc-fs-BCM-data-monitoring-master +inf: monitoring-master
clone-fs-BCM-data
colocation coloc-fs-BCM-data-nfs +inf: nfs clone-fs-BCM-data
colocation coloc-fs-BCM-log-monitoring-master +inf: monitoring-master
clone-fs-BCM-log
colocation coloc-fs-BCM-log-nfs +inf: nfs clone-fs-BCM-log
colocation coloc-mysql-fs-DBs-mysql +inf: mysql clone-mysql-fs-DBs
colocation coloc-postgresql-fs-DBs-postgresql +inf: postgresql
clone-postgresql-fs-DBs
colocation o2cb-with-dlm inf: clone-o2cb clone-dlm
order order-clone-fs-BCM-MCO-o2cb inf: clone-o2cb clone-fs-BCM-MCO
order order-clone-fs-BCM-conf-o2cb inf: clone-o2cb clone-fs-BCM-conf
order order-clone-fs-BCM-console-o2cb inf: clone-o2cb
clone-fs-BCM-console
order order-clone-fs-BCM-data-o2cb inf: clone-o2cb clone-fs-BCM-data
order order-clone-fs-BCM-log-o2cb inf: clone-o2cb clone-fs-BCM-log
order order-clone-fs-BCM-storage-o2cb inf: clone-o2cb
clone-fs-BCM-storage
order order-clone-mysql-fs-DBs-o2cb inf: clone-o2cb clone-mysql-fs-DBs
order order-clone-postgresql-fs-DBs-o2cb inf: clone-o2cb
clone-postgresql-fs-DBs
order order-monitoring-master inf: clone-fs-BCM-MCO clone-fs-BCM-log
clone-fs-BCM-data clone-fs-BCM-conf monitoring-master
order order-mysql inf: clone-mysql-fs-DBs mysql
order order-nfs inf: clone-fs-BCM-console clone-fs-BCM-MCO
clone-fs-BCM-log clone-fs-BCM-data clone-fs-BCM-conf nfs
order order-postgresql inf: clone-postgresql-fs-DBs postgresql
order start-o2cb-after-dlm inf: clone-dlm clone-o2cb
property $id="cib-bootstrap-options" \
dc-version="1.1.2-c6b59218ee949eebff30e837ff6f3824ed0ab86b" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="true" \
no-quorum-policy="ignore" \
default-resource-stickiness="5000" \
last-lrm-refresh="1286452453"
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster