Re: crash using ceph-osdomap-tool

Kefu,

After doing a make clean, it doesn't crash anymore.  Sorry about that.

David


On 9/5/18 2:39 AM, kefu chai wrote:
On Wed, Sep 5, 2018 at 11:59 AM David Zafman <dzafman@xxxxxxxxxx> wrote:

Kefu,

With a vstart.sh cluster the ceph-osdomap-tool is broken.  It might be
related to change e406d8eb9e1deb801ecb346169eaaf96adbb4b53 which changed
the locking.

David, i tried to reproduce this issue in an ubuntu 16.04 docker container
with GCC 7.3 and in debian sid with GCC 8.2, using up-to-date master and
dzafman:wip-23875. none of the 4 combinations crashes. i tested with the
following steps:

$ MDS=0 MGR=1 OSD=3 MON=3 ../src/vstart.sh -X -n --filestore
$ bin/init-ceph stop osd.0
$ bin/ceph-osdomap-tool --no-mon-config --omap-path dev/osd0/current/omap --command dump-objects
Version: 3
Seq: 1
legacy: false

and i also used gdb to launch the executable and set a breakpoint at
ceph_osdomap_tool.cc:80 to make sure that this line is executed. and
it was. is there any specific setting you are using when building
Ceph?

my configure is:
cmake -DCMAKE_BUILD_TYPE=Debug -DWITH_MGR_DASHBOARD_FRONTEND=OFF
-DBOOST_J=8 -DWITH_DPDK=OFF -DWITH_SPDK=OFF -DWITH_SEASTAR=ON
-DENABLE_GIT_VERSION=OFF -DCMAKE_INSTALL_PREFIX:PATH=$HOME/.local
-DWITH_PYTHON3=ON -DMGR_PYTHON_VERSION=3 ..


$ gdb bin/ceph-osdomap-tool
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from bin/ceph-osdomap-tool...done.
(gdb) run --no-mon-config --omap-path dev/osd0/current/omap --command
dump-objects
Starting program: /src/ceph/build/bin/ceph-osdomap-tool --no-mon-config
--omap-path dev/osd0/current/omap --command dump-objects
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffeaea2700 (LWP 33725)]
ceph-osdomap-tool: ../nptl/pthread_mutex_lock.c:81:
__pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

Thread 1 "ceph-osdomap-to" received signal SIGABRT, Aborted.
0x00007fffed3eb428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007fffed3eb428 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007fffed3ed02a in __GI_abort () at abort.c:89
#2  0x00007fffed3e3bd7 in __assert_fail_base (fmt=<optimized out>,
assertion=assertion@entry=0x7fffee446015 "mutex->__data.__owner == 0",
file=file@entry=0x7fffee445ff8 "../nptl/pthread_mutex_lock.c",
line=line@entry=81, function=function@entry=0x7fffee446180
<__PRETTY_FUNCTION__.8623> "__pthread_mutex_lock") at assert.c:92
#3  0x00007fffed3e3c82 in __GI___assert_fail
(assertion=assertion@entry=0x7fffee446015 "mutex->__data.__owner == 0",
file=file@entry=0x7fffee445ff8 "../nptl/pthread_mutex_lock.c",
line=line@entry=81, function=function@entry=0x7fffee446180
<__PRETTY_FUNCTION__.8623> "__pthread_mutex_lock") at assert.c:101
#4  0x00007fffee43cf68 in __GI___pthread_mutex_lock
(mutex=mutex@entry=0x5555567b1620) at ../nptl/pthread_mutex_lock.c:81
#5  0x00007fffee8eef49 in Mutex::Lock (this=this@entry=0x5555567b15f8,
no_lockdep=no_lockdep@entry=false) at
/home/dzafman/ceph/src/common/Mutex.cc:107
#6  0x000055555574e661 in Mutex::Locker::Locker (m=..., this=<synthetic
pointer>) at /home/dzafman/ceph/src/common/Mutex.h:116
#7  ConfigProxy::parse_config_files (flags=0, warnings=<optimized out>,
conf_files=0x0, this=0x5555567ae008) at
/home/dzafman/ceph/src/common/config_proxy.h:199
#8  global_pre_init (defaults=<optimized out>, args=std::vector of
length 1, capacity 1 = {...}, module_type=<optimized out>,
code_env=code_env@entry=CODE_ENVIRONMENT_UTILITY_NODOUT,
flags=flags@entry=0) at /home/dzafman/ceph/src/global/global_init.cc:114
#9  0x000055555574eba7 in global_init (defaults=<optimized out>,
args=..., module_type=<optimized out>,
code_env=CODE_ENVIRONMENT_UTILITY_NODOUT, flags=0, data_dir_option=0x0,
run_pre_init=true) at /home/dzafman/ceph/src/global/global_init.cc:176
#10 0x0000555555643a7d in main (argc=<optimized out>, argv=<optimized
out>) at /home/dzafman/ceph/src/tools/ceph_osdomap_tool.cc:80


commit e406d8eb9e1deb801ecb346169eaaf96adbb4b53
Author: Kefu Chai <kchai@xxxxxxxxxx>
Date:   Sun Jul 15 16:49:59 2018 +0800

      common/config: promote lock from md_config_t to ConfigProxy

      seastar's ConfigProxy and alien's ConfigProxy follow different threading
      models and expose different methods. the former updates a setting with 3
      steps:
      1. create a local copy of current setting, and apply the proposed change
         to the copy
      2. populate the updated change with a foreign_ptr<> to all shards
         (including itself)
      3. on each shards, call apply_changes() to get the interested observers
         updated, please note, apply_changes() should only update the local
         observers on current shard.

      while the alien's ConfigProxy do all the job in a single synchronized
      call, but we can split it into a finer-grained steps:
      1. apply the proposed change in-place
      2. apply_changes() to get the interested observers updated.

      so, to reuse the code across these two implementations, for instance,
      set_mon_vals() will be implemented in ConfigProxy instead, so we can
      have different behavior in different ConfigProxy classes. if we keep
      using the existing single-piece md_config_t::set_mon_vals(), we have no
      chance to differentiate the apply_changes() for seastar port. but the
      alien implementation requires a grand lock protecting set_val() and
      apply_changes(), so we have to move the lock from md_config_t up to
      ConfigProxy. it's also simpler this way, as we don't need an extra layer
      to have a dummy Mutex for seastar's ConfigProxy. as only the alien's
      ConfigProxy requires the lock.

      Signed-off-by: Kefu Chai <kchai@xxxxxxxxxx>

David
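
For readers following the commit message quoted above, here is a minimal sketch of what "moving the lock from md_config_t up to ConfigProxy" could look like. It is not the Ceph code: ConfigValues, AlienConfigProxy, set_val_and_apply() and the observer signature are hypothetical names made up for illustration, and std::mutex stands in for Ceph's Mutex class. The only idea taken from the commit message is that the value store holds no lock of its own, while the proxy takes one lock around both the in-place change and apply_changes().

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for md_config_t after the change:
// plain storage with no lock of its own.
struct ConfigValues {
  std::map<std::string, std::string> values;
  std::vector<std::string> pending;  // keys changed since the last apply

  void set_val(const std::string& key, const std::string& val) {
    values[key] = val;
    pending.push_back(key);
  }
};

// Hypothetical stand-in for the alien ConfigProxy: it owns the lock.
class AlienConfigProxy {
 public:
  using Observer =
      std::function<void(const std::string&, const std::string&)>;

  void add_observer(Observer obs) {
    std::lock_guard<std::mutex> l(lock_);
    observers_.push_back(std::move(obs));
  }

  // One lock acquisition covers both steps from the commit message:
  // 1. apply the proposed change in place, 2. notify the observers.
  void set_val_and_apply(const std::string& key, const std::string& val) {
    std::lock_guard<std::mutex> l(lock_);
    config_.set_val(key, val);
    apply_changes_locked();
  }

  // Returns an empty string when the key is unknown.
  std::string get_val(const std::string& key) const {
    std::lock_guard<std::mutex> l(lock_);
    auto it = config_.values.find(key);
    return it == config_.values.end() ? std::string() : it->second;
  }

 private:
  // Caller must already hold lock_.
  void apply_changes_locked() {
    for (const auto& key : config_.pending) {
      for (const auto& obs : observers_) {
        obs(key, config_.values.at(key));
      }
    }
    config_.pending.clear();
  }

  mutable std::mutex lock_;
  ConfigValues config_;
  std::vector<Observer> observers_;
};
```

In this sketch the observers run while the lock is held, so an observer that calls back into get_val() would deadlock on the non-recursive std::mutex. A proxy for a different threading model could wrap the same ConfigValues without any lock, which is the kind of reuse the commit message describes.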





