tcmalloc on ARMv7 is problematic. You need to compile your own Ceph build with either jemalloc or just the libc malloc.
/Torben
On 20 May 2019 17:48:40 CEST, "Jesper Taxbøl" <jesper@xxxxxxxxxx> wrote:
I am trying to set up a Ceph cluster on 4 odroid-hc2 instances on top of Ubuntu 18.04.
My ceph-mgr daemon keeps crashing on me.
Any advice on how to proceed?
The log on the mgr node says something about ms_dispatch:
2019-05-20 15:34:43.070424 b6714230 0 set uid:gid to 64045:64045 (ceph:ceph)
2019-05-20 15:34:43.070455 b6714230 0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable), process ceph-mgr, pid 1169
2019-05-20 15:34:43.070799 b6714230 0 pidfile_write: ignore empty --pid-file
2019-05-20 15:34:43.101162 b6714230 1 mgr send_beacon standby
2019-05-20 15:34:43.124462 b06f8c30 -1 *** Caught signal (Segmentation fault) **
in thread b06f8c30 thread_name:ms_dispatch
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
1: (()+0x30133c) [0x77033c]
2: (()+0x25750) [0xb688a750]
3: (_ULarm_step()+0x55) [0xb6816ce6]
4: (()+0x255e8) [0xb6cd85e8]
5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
8: (tcmalloc::CentralFreeList::Populate()+0x71) [0xb6ccc5ce]
9: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x1b) [0xb6ccc760]
10: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x6d) [0xb6ccc7de]
11: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int)+0x51) [0xb6ccea56]
12: (malloc()+0x22d) [0xb6cd9a8e]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events ---
-90> 2019-05-20 15:34:43.053293 b6714230 5 asok(0x55b5320) register_command perfcounter
s_dump hook 0x554c088
-89> 2019-05-20 15:34:43.053322 b6714230 5 asok(0x55b5320) register_command 1 hook 0x55
4c088
-88> 2019-05-20 15:34:43.053330 b6714230 5 asok(0x55b5320) register_command perf dump h
ook 0x554c088
-87> 2019-05-20 15:34:43.053341 b6714230 5 asok(0x55b5320) register_command perfcounter
s_schema hook 0x554c088
-86> 2019-05-20 15:34:43.053360 b6714230 5 asok(0x55b5320) register_command perf histog
ram dump hook 0x554c088
-85> 2019-05-20 15:34:43.053374 b6714230 5 asok(0x55b5320) register_command 2 hook 0x55
4c088
-84> 2019-05-20 15:34:43.053381 b6714230 5 asok(0x55b5320) register_command perf schema
hook 0x554c088
-83> 2019-05-20 15:34:43.053389 b6714230 5 asok(0x55b5320) register_command perf histog
ram schema hook 0x554c088
-82> 2019-05-20 15:34:43.053410 b6714230 5 asok(0x55b5320) register_command perf reset
hook 0x554c088
-81> 2019-05-20 15:34:43.053418 b6714230 5 asok(0x55b5320) register_command config show
hook 0x554c088
-80> 2019-05-20 15:34:43.053425 b6714230 5 asok(0x55b5320) register_command config help
hook 0x554c088
-79> 2019-05-20 15:34:43.053436 b6714230 5 asok(0x55b5320) register_command config set
hook 0x554c088
-78> 2019-05-20 15:34:43.053444 b6714230 5 asok(0x55b5320) register_command config get
hook 0x554c088
-77> 2019-05-20 15:34:43.053459 b6714230 5 asok(0x55b5320) register_command config diff
hook 0x554c088
-76> 2019-05-20 15:34:43.053467 b6714230 5 asok(0x55b5320) register_command config diff
get hook 0x554c088
-75> 2019-05-20 15:34:43.053475 b6714230 5 asok(0x55b5320) register_command log flush h
ook 0x554c088
-74> 2019-05-20 15:34:43.053482 b6714230 5 asok(0x55b5320) register_command log dump ho
ok 0x554c088
-73> 2019-05-20 15:34:43.053490 b6714230 5 asok(0x55b5320) register_command log reopen
hook 0x554c088
-72> 2019-05-20 15:34:43.053513 b6714230 5 asok(0x55b5320) register_command dump_mempoo
ls hook 0x56e3504
-71> 2019-05-20 15:34:43.070424 b6714230 0 set uid:gid to 64045:64045 (ceph:ceph)
-70> 2019-05-20 15:34:43.070455 b6714230 0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d
66faee0ba30ea23eee) luminous (stable), process ceph-mgr, pid 1169
-69> 2019-05-20 15:34:43.070799 b6714230 0 pidfile_write: ignore empty --pid-file
-68> 2019-05-20 15:34:43.074441 b6714230 5 asok(0x55b5320) init /var/run/ceph/ceph-mgr.
odroid-c.asok
-67> 2019-05-20 15:34:43.074473 b6714230 5 asok(0x55b5320) bind_and_listen /var/run/cep
h/ceph-mgr.odroid-c.asok
-66> 2019-05-20 15:34:43.074615 b6714230 5 asok(0x55b5320) register_command 0 hook 0x55
4c1d0
-65> 2019-05-20 15:34:43.074633 b6714230 5 asok(0x55b5320) register_command version hoo
k 0x554c1d0
-64> 2019-05-20 15:34:43.074654 b6714230 5 asok(0x55b5320) register_command git_version
hook 0x554c1d0
-63> 2019-05-20 15:34:43.074674 b6714230 5 asok(0x55b5320) register_command help hook 0
x554c1d8
-62> 2019-05-20 15:34:43.074694 b6714230 5 asok(0x55b5320) register_command get_command
_descriptions hook 0x554c1e0
-61> 2019-05-20 15:34:43.074785 b3effc30 5 asok(0x55b5320) entry start
-60> 2019-05-20 15:34:43.076464 b36fec30 2 Event(0x554e068 nevent=5000 time_id=1).set_o
wner idx=0 owner=3010456624
-59> 2019-05-20 15:34:43.076559 b2efdc30 2 Event(0x554e488 nevent=5000 time_id=1).set_o
wner idx=1 owner=3002063920
-58> 2019-05-20 15:34:43.076643 b26fcc30 2 Event(0x554e1c8 nevent=5000 time_id=1).set_o
wner idx=2 owner=2993671216
-57> 2019-05-20 15:34:43.077177 b6714230 1 Processor -- start
-56> 2019-05-20 15:34:43.077298 b6714230 1 -- - start start
-55> 2019-05-20 15:34:43.077315 b6714230 10 monclient: build_initial_monmap
-54> 2019-05-20 15:34:43.077362 b6714230 10 monclient: init
-53> 2019-05-20 15:34:43.077380 b6714230 5 adding auth protocol: cephx
-52> 2019-05-20 15:34:43.077391 b6714230 10 monclient: auth_supported 2 method cephx
-51> 2019-05-20 15:34:43.077625 b6714230 2 auth: KeyRing::load: loaded key file /var/li
b/ceph/mgr/ceph-odroid-c/keyring
-50> 2019-05-20 15:34:43.077761 b6714230 10 monclient: _reopen_session rank -1
-49> 2019-05-20 15:34:43.077847 b6714230 10 monclient(hunting): picked mon.noname-a con
0x5792d00 addr 192.168.130.131:6789/0
-48> 2019-05-20 15:34:43.077899 b6714230 1 -- - --> 192.168.130.131:6789/0 -- auth(prot
o 0 33 bytes epoch 0) v1 -- 0x5590680 con 0
-47> 2019-05-20 15:34:43.077985 b6714230 10 monclient(hunting): _renew_subs
-46> 2019-05-20 15:34:43.080980 b2efdc30 1 -- 192.168.130.132:0/2049423493 learned_addr
learned my addr 192.168.130.132:0/2049423493
-45> 2019-05-20 15:34:43.082020 b2efdc30 2 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_c
onnection got newly_acked_seq 0 vs out_seq 0
-44> 2019-05-20 15:34:43.084528 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 1 0x55aa900 mon_map magic: 0 v1
-43> 2019-05-20 15:34:43.084615 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 19
2.168.130.131:6789/0 1 ==== mon_map magic: 0 v1 ==== 196+0+0 (1694575244 0 0) 0x55aa900 con
0x5792d00
-42> 2019-05-20 15:34:43.084656 b06f8c30 10 monclient(hunting): handle_monmap mon_map ma
gic: 0 v1
-41> 2019-05-20 15:34:43.084685 b06f8c30 10 monclient(hunting): got monmap 1, mon.nonam
e-a is now rank -1
-40> 2019-05-20 15:34:43.084698 b06f8c30 10 monclient(hunting): dump:
epoch 1
fsid 75cb9a2d-673b-4a32-897a-05470a08ed58
last_changed 2019-05-20 15:02:53.998735
created 2019-05-20 15:02:53.998735
0: 192.168.130.131:6789/0 mon.odroid-b
-39> 2019-05-20 15:34:43.084956 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 2 0x55a0540 auth_reply(proto 2 0 (0) Success) v1
-38> 2019-05-20 15:34:43.085011 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 19
2.168.130.131:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (4086221156 0
0) 0x55a0540 con 0x5792d00
-37> 2019-05-20 15:34:43.085053 b06f8c30 10 monclient(hunting): my global_id is 24139
-36> 2019-05-20 15:34:43.085175 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.
130.131:6789/0 -- auth(proto 2 32 bytes epoch 0) v1 -- 0x5590d00 con 0
-35> 2019-05-20 15:34:43.088488 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 3 0x55a0700 auth_reply(proto 2 0 (0) Success) v1
-34> 2019-05-20 15:34:43.088712 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 19
2.168.130.131:6789/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 222+0+0 (1945430716 0
0) 0x55a0700 con 0x5792d00
-33> 2019-05-20 15:34:43.089295 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.
130.131:6789/0 -- auth(proto 2 181 bytes epoch 0) v1 -- 0x5590680 con 0
-32> 2019-05-20 15:34:43.097488 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 4 0x55a08c0 auth_reply(proto 2 0 (0) Success) v1
-31> 2019-05-20 15:34:43.097643 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 19
2.168.130.131:6789/0 4 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 783+0+0 (327382700 0
0) 0x55a08c0 con 0x5792d00
-30> 2019-05-20 15:34:43.098725 b06f8c30 1 monclient: found mon.odroid-b
-29> 2019-05-20 15:34:43.098850 b06f8c30 10 monclient: _send_mon_message to mon.odroid-b
at 192.168.130.131:6789/0
-28> 2019-05-20 15:34:43.098898 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.
130.131:6789/0 -- mon_subscribe({mgrmap=0+,monmap=0+}) v2 -- 0x554eb00 con 0
-27> 2019-05-20 15:34:43.099042 b06f8c30 10 monclient: _check_auth_rotating renewing rot
ating keys (they expired before 2019-05-20 15:34:13.099036)
-26> 2019-05-20 15:34:43.099183 b06f8c30 10 monclient: _send_mon_message to mon.odroid-b
at 192.168.130.131:6789/0
-25> 2019-05-20 15:34:43.099271 b06f8c30 1 -- 192.168.130.132:0/2049423493 --> 192.168.
130.131:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x5590d00 con 0
-24> 2019-05-20 15:34:43.099404 b6714230 5 monclient: authenticate success, global_id 2
4139
-23> 2019-05-20 15:34:43.099543 b6714230 10 log_channel(cluster) update_config to_monito
rs: true to_syslog: false syslog_facility: daemon prio: info to_graylog: false graylog_host
: 127.0.0.1 graylog_port: 12201)
-22> 2019-05-20 15:34:43.099602 b6714230 10 log_channel(audit) update_config to_monitors
: true to_syslog: false syslog_facility: local0 prio: info to_graylog: false graylog_host:
127.0.0.1 graylog_port: 12201)
-21> 2019-05-20 15:34:43.099970 b6714230 5 asok(0x55b5320) register_command objecter_re
quests hook 0x554c238
-20> 2019-05-20 15:34:43.100171 b6714230 10 monclient: _renew_subs
-19> 2019-05-20 15:34:43.100214 b6714230 10 monclient: _send_mon_message to mon.odroid-b
at 192.168.130.131:6789/0
-18> 2019-05-20 15:34:43.100246 b6714230 1 -- 192.168.130.132:0/2049423493 --> 192.168.
130.131:6789/0 -- mon_subscribe({osdmap=0}) v2 -- 0x554ec60 con 0
-17> 2019-05-20 15:34:43.100737 b6714230 5 asok(0x55b5320) register_command mds_request
s hook 0xbefefe80
-16> 2019-05-20 15:34:43.100793 b6714230 5 asok(0x55b5320) register_command mds_session
s hook 0xbefefe80
-15> 2019-05-20 15:34:43.100847 b6714230 5 asok(0x55b5320) register_command dump_cache
hook 0xbefefe80
-14> 2019-05-20 15:34:43.100811 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 5 0x558dc00 mgrmap(e 99) v1
-13> 2019-05-20 15:34:43.100915 b6714230 5 asok(0x55b5320) register_command kick_stale_
sessions hook 0xbefefe80
-12> 2019-05-20 15:34:43.100977 b6714230 5 asok(0x55b5320) register_command status hook
0xbefefe80
-11> 2019-05-20 15:34:43.100987 b06f8c30 1 -- 192.168.130.132:0/2049423493 <== mon.0 19
2.168.130.131:6789/0 5 ==== mgrmap(e 99) v1 ==== 232+0+0 (4078310027 0 0) 0x558dc00 con 0x5
792d00
-10> 2019-05-20 15:34:43.101004 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 6 0x55aaa80 mon_map magic: 0 v1
-9> 2019-05-20 15:34:43.101162 b6714230 1 mgr send_beacon standby
-8> 2019-05-20 15:34:43.101575 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 7 0x55a0540 auth_reply(proto 2 0 (0) Success) v1
-7> 2019-05-20 15:34:43.101889 b2efdc30 5 -- 192.168.130.132:0/2049423493 >> 192.168.1
30.131:6789/0 conn(0x5792d00 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=45 cs=1
l=1). rx mon.0 seq 8 0x5590d00 osd_map(42..42 src has 1..42) v3
-6> 2019-05-20 15:34:43.102775 b6714230 10 monclient: _send_mon_message to mon.odroid-b
at 192.168.130.131:6789/0
-5> 2019-05-20 15:34:43.102838 b6714230 1 -- 192.168.130.132:0/2049423493 --> 192.168.
130.131:6789/0 -- mgrbeacon mgr.odroid-c(75cb9a2d-673b-4a32-897a-05470a08ed58,24139, -, 0)
v6 -- 0x5562400 con 0
-4> 2019-05-20 15:34:43.102991 b6714230 4 mgr init Complete.
-3> 2019-05-20 15:34:43.103065 b06f8c30 4 mgr ms_dispatch standby mgrmap(e 99) v1
-2> 2019-05-20 15:34:43.103110 b06f8c30 4 mgr handle_mgr_map received map epoch 99
-1> 2019-05-20 15:34:43.103128 b06f8c30 4 mgr handle_mgr_map active in map: 0 active i
s 24134
0> 2019-05-20 15:34:43.124462 b06f8c30 -1 *** Caught signal (Segmentation fault) **
in thread b06f8c30 thread_name:ms_dispatch
ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
1: (()+0x30133c) [0x77033c]
2: (()+0x25750) [0xb688a750]
3: (_ULarm_step()+0x55) [0xb6816ce6]
4: (()+0x255e8) [0xb6cd85e8]
5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
8: (tcmalloc::CentralFreeList::Populate()+0x71) [0xb6ccc5ce]
9: (tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**)+0x1b) [0xb6ccc760]
10: (tcmalloc::CentralFreeList::RemoveRange(void**, void**, int)+0x6d) [0xb6ccc7de]
11: (tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, unsigned int)+0x51) [0xb6ccea56]
12: (malloc()+0x22d) [0xb6cd9a8e]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 5 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent 10000
max_new 1000
log_file /var/log/ceph/ceph-mgr.odroid-c.log
--- end dump of recent events ---
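The backtrace in this dump runs from malloc() down through several tcmalloc:: frames before the segfault, which is consistent with the allocator being the culprit. A minimal sketch of how one might check a pasted Ceph backtrace for such frames follows. The function name `tcmalloc_frames` and the sample string are made up for illustration and are not part of any Ceph tooling; the check only looks at the `tcmalloc::` namespace prefix, so helper frames such as GetStackTrace are not counted.

```python
import re
from typing import List, Tuple

# Matches numbered frames of the form
#   " 6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]"
# and captures the frame number and the symbol (everything before "+0x...").
FRAME_RE = re.compile(r"^\s*(\d+):\s+\((.*)\+0x[0-9a-fA-F]+\)")


def tcmalloc_frames(backtrace: str) -> List[Tuple[int, str]]:
    """Return (frame number, symbol) for each frame in the tcmalloc namespace."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group(2).startswith("tcmalloc::"):
            frames.append((int(match.group(1)), match.group(2)))
    return frames


# A few frames copied from the log above, used only as sample input.
SAMPLE_BACKTRACE = """\
 5: (GetStackTrace(void**, int, int)+0x25) [0xb6cd8a3e]
 6: (tcmalloc::PageHeap::GrowHeap(unsigned int)+0xb9) [0xb6ccd36a]
 7: (tcmalloc::PageHeap::New(unsigned int)+0x79) [0xb6ccd5e6]
 12: (malloc()+0x22d) [0xb6cd9a8e]
"""

if __name__ == "__main__":
    for number, symbol in tcmalloc_frames(SAMPLE_BACKTRACE):
        print(f"frame {number}: {symbol}")
```

If the list comes back non-empty for a crash in malloc(), the allocator is a reasonable first suspect, though this is only a heuristic and not proof of the root cause.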
Kind regards
Jesper
--
Sent from my mobile phone. Please excuse my brevity.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com