Re: Segfault when creating new cluster

Hi List!

I have tracked down the bad commit to de640d85fa3e0e5e5a31704eab5a8714a1ffe867.

I have also created a patch that fixes this error on my test cluster. I am attaching it here for peer review.

---
Thanks,
Dyweni



On Sat, 14 May 2011 19:17:42 -0500, Dyweni - Ceph-Devel wrote:

Hi List!

When creating a brand new cluster, I get the following segmentation
fault:

=== osd.2 ===
pushing conf and monmap to ceph2
Warning: Permanently added 'ceph2' (ECDSA) to the list of known hosts.
umount: /data/osd2: not mounted
umount: /dev/sda: not mounted

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sda
nodesize 4096 leafsize 4096 sectorsize 4096 size 74.53GB
Btrfs Btrfs v0.19
Scanning for Btrfs filesystems
** WARNING: Ceph is still under development. Any feedback can be directed **
** at ceph-devel@xxxxxxxxxxxxxxx or http://ceph.newdream.net/. **
*** Caught signal (Segmentation fault) **
in thread 0xb70f2b30
ceph version 0.27.1-401-g6af0379
(commit:6af0379e27ac71a7abd8c9ebb0145ae8b9f66cc4)
1: (ceph::BackTrace::BackTrace(int)+0x1f) [0x8465fcf]
2: /usr/bin/cosd() [0x84d8844]
3: [0xb77f1400]
4: (pthread_spin_lock()+0x6) [0xb77c38d6]
5: (ceph::Spinlock::lock()+0x20) [0x82e42e8]
6: (ceph::atomic_t::dec()+0x12) [0x82e4418]
7: (RefCountedObject::put()+0x15) [0x82e48d9]
8: (MonClient::get_monmap_privately()+0x5f2) [0x84c81ec]
9: (main()+0x976) [0x82e0cce]
10: (__libc_start_main()+0xd9) [0xb7109ba9]
11: /usr/bin/cosd() [0x82e0101]
/usr/sbin/mkcephfs: line 239: 859 Segmentation fault (core
dumped) $BINDIR/cosd -c $conf --monmap $dir/monmap -i $id --mkfs
failed: 'ssh ceph2 /usr/sbin/mkcephfs -d /tmp/mkcephfs.6ySmaVjdFm
--init-daemon osd.2'

Here is the GDB backtrace:

(gdb) bt
#0 0xb77c6d6f in raise () from /lib/libpthread.so.0
#1 0x084d870f in reraise_fatal (signum=11) at common/signal.cc:63
#2 0x084d88ce in handle_fatal_signal (signum=11) at
common/signal.cc:110
#3 <signal handler called>
#4 0xb77c38d6 in pthread_spin_lock () from /lib/libpthread.so.0
#5 0x082e42e8 in ceph::Spinlock::lock (this=0x4) at
include/Spinlock.h:97
#6 0x082e4418 in ceph::atomic_t::dec (this=0x4) at include/atomic.h:75
#7 0x082e48d9 in RefCountedObject::put (this=0x0) at msg/Message.h:160
#8 0x084c81ec in MonClient::get_monmap_privately (this=0xbf81baf4) at
mon/MonClient.cc:230
#9 0x082e0cce in main (argc=8, argv=0xbf81c1f4) at cosd.cc:130

My kernel is:
Linux version 2.6.39-rc7-git5-20110514-0905 (root@phenom) (gcc version
4.4.5 (Gentoo 4.4.5 p1.2, pie-0.4.5) ) #1 SMP Sat May 14 09:07:07 CDT
2011

--
Thanks,
Dyweni


From acf86f21d3c11e8edd82692a4fa27a5b88c538b0 Mon Sep 17 00:00:00 2001
From: root <root@xxxxxxxxxxxxxxxxx>
Date: Sun, 15 May 2011 08:54:13 -0500
Subject: [PATCH] fix segfault introduced by commit de640d85fa3e0e5e5a31704eab5a8714a1ffe867

That commit introduced the line 'cur_con->put()', which can be reached
while cur_con is still NULL, so put() dereferences a null pointer.
---
 src/mon/MonClient.cc |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mon/MonClient.cc b/src/mon/MonClient.cc
index 70e14e9..9707dfe 100644
--- a/src/mon/MonClient.cc
+++ b/src/mon/MonClient.cc
@@ -227,8 +227,10 @@ int MonClient::get_monmap_privately()
   hunting = true;  // reset this to true!
   cur_mon.clear();
 
-  cur_con->put();
-  cur_con = NULL;
+  if (cur_con) {
+    cur_con->put();
+    cur_con = NULL;
+  }
 
   if (monmap.epoch)
     return 0;
-- 
1.7.3.4
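
For anyone reviewing, the guard in the patch can also be read as a small "release if held, then clear" helper. The sketch below is only an illustration under stated assumptions: RefCounted is a hypothetical stand-in for RefCountedObject, and put_and_clear is a made-up name, not a function in the Ceph tree.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a reference-counted object (NOT Ceph's
// RefCountedObject). The creator owns the initial reference.
struct RefCounted {
  int nref;
  RefCounted() : nref(1) {}
  virtual ~RefCounted() {}

  RefCounted *get() {
    ++nref;
    return this;
  }

  void put() {
    assert(nref > 0);  // dropping a reference nobody holds is a bug
    if (--nref == 0)
      delete this;
  }
};

// Hypothetical helper mirroring the patched code path: only call put()
// when a reference is actually held, then reset the pointer.
// Returns true if a reference was released, false if p was already NULL.
template <typename T>
bool put_and_clear(T *&p) {
  if (!p)
    return false;  // nothing held; calling p->put() here is what crashed
  p->put();
  p = NULL;
  return true;
}
```

With a helper like this, the two patched lines would collapse to a single call such as put_and_clear(cur_con), and the NULL case seen in the backtrace (RefCountedObject::put with this=0x0) becomes a no-op.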

