Re: ceph-mon segmentation fault

[Re-added the list]

On Thu, Feb 20, 2014 at 8:09 AM, Pavel V. Kaygorodov <pasha@xxxxxxxxx> wrote:
> Hi!
>
>> I created a ticket: http://tracker.ceph.com/issues/7487
>>
>> But my guess is that this is a result of having 0 CRUSH weight for the
>> entire tree while linking them up. Can you give the OSD a weight and
>> see if it works after that?
>
> How do I do this?
> I'm still not very familiar with the ceph tools yet :)

See http://ceph.com/docs/master/rados/operations/add-or-rm-osds/#adding-osds
In particular you'll want to use "ceph osd crush reweight <name>
<weight>". (The weight should probably just be 1, or the disk size in
TB, or similar.)
I assumed you were basically following those steps already!
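
A minimal sketch of giving the OSD a non-zero CRUSH weight follows. It assumes osd.0 is already registered as an OSD and that the host bucket is named osd-host, as in the output quoted below. The two helper function names are hypothetical wrappers, not Ceph commands. Note that "ceph osd crush reweight" sets the CRUSH weight, while "ceph osd reweight" sets a separate override value between 0 and 1.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of Ceph.

# Set the CRUSH weight of an item that is already in the CRUSH map.
#   $1 = item name (e.g. osd.0)
#   $2 = weight (e.g. 1, or the disk size in TB)
set_crush_weight() {
    ceph osd crush reweight "$1" "$2"
}

# Add an OSD to the CRUSH map with a weight and a location in one step,
# following the "adding OSDs" documentation.
#   $1 = OSD name, $2 = weight, remaining args = location (e.g. host=osd-host)
add_osd_with_weight() {
    name=$1
    weight=$2
    shift 2
    ceph osd crush add "$name" "$weight" "$@"
}

# Example, to be run against a live cluster:
#   set_crush_weight osd.0 1
#   ceph osd tree    # the weight column for osd.0 should no longer be 0
```

Whether this avoids the crash tracked in the ticket is untested here; it only shows the commands for assigning a weight.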
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>
> Pavel.
>
>
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Tue, Feb 18, 2014 at 4:21 AM, Pavel V. Kaygorodov <pasha@xxxxxxxxx> wrote:
>>> Hi!
>>>
>>> Playing with ceph, I found a bug:
>>>
>>> I have compiled and installed ceph from sources on debian/jessie:
>>>
>>> git clone --recursive -b v0.75 https://github.com/ceph/ceph.git
>>> cd ceph/ && ./autogen.sh && ./configure && make && make install
>>>
>>> /usr/local/bin/ceph-authtool --create-keyring /data/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
>>> /usr/local/bin/ceph-authtool --create-keyring /ceph.client.admin.keyring --gen-key -n client.admin --set-uid=0 --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow'
>>> /usr/local/bin/ceph-authtool /data/ceph.mon.keyring --import-keyring /ceph.client.admin.keyring
>>> /usr/local/bin/monmaptool --create --fsid e90dfd37-98d1-45bb-a847-8590a5ed8e71 /data/monmap
>>> /usr/local/bin/ceph-mon --mkfs -i ceph-mon.dkctl --monmap /data/monmap --keyring /data/ceph.mon.keyring
>>>
>>> My ceph.conf is below (I have configured a local TLD, dkctl., with an A record for ceph-mon):
>>>
>>> [global]
>>>
>>> fsid = e90dfd37-98d1-45bb-a847-8590a5ed8e71
>>> mon initial members = ceph-mon.dkctl
>>>
>>> auth cluster required = cephx
>>> auth service required = cephx
>>> auth client required = cephx
>>>
>>> keyring = /ceph.client.admin.keyring
>>>
>>> osd pool default size = 2
>>> osd pool default min size = 2
>>> osd pool default pg num = 333
>>> osd pool default pgp num = 333
>>> osd crush chooseleaf type = 1
>>> osd journal size = 1000
>>>
>>> filestore xattr use omap = true
>>>
>>> mon host = ceph-mon.dkctl
>>> mon addr = ceph-mon.dkctl
>>>
>>> log file = /data/logs/ceph.log
>>>
>>> [mon]
>>> mon data = /data/mon
>>> keyring = /data/ceph.mon.keyring
>>> log file = /data/logs/mon.log
>>>
>>> [osd.0]
>>> osd host    = osd0
>>> osd data    = /data/osd
>>> osd journal = /data/osd.journal
>>> log file    = /data/logs/osd.log
>>> keyring     = /data/ceph.osd.keyring
>>>
>>> Then I started ceph-mon:
>>>
>>> /usr/local/bin/ceph-mon -c /ceph.conf --public-addr `grep ceph-mon /etc/hosts | awk '{print $1}'` -i ceph-mon.dkctl
>>>
>>> After that, the following commands crashed the ceph-mon daemon:
>>>
>>> root@ceph-mon:/# ceph osd crush add-bucket osd-host host
>>> added bucket osd-host type host to crush map
>>> root@ceph-mon:/# ceph osd crush move osd-host root=default
>>> moved item id -2 name 'osd-host' to location {root=default} in crush map
>>> root@ceph-mon:/# ceph osd crush add-bucket osd.0 osd
>>> added bucket osd.0 type osd to crush map
>>> root@ceph-mon:/# ceph osd tree
>>> # id    weight  type name       up/down reweight
>>> -3      0       osd osd.0
>>> -1      0       root default
>>> -2      0               host osd-host
>>>
>>> root@ceph-mon:/# ceph osd crush move osd.0 host=osd-host
>>> 2014-02-18 16:00:14.093243 7ff077fff700  0 monclient: hunting for new mon
>>> 2014-02-18 16:00:14.093781 7ff07c130700  0 -- 172.17.0.160:0/1000148 >> 172.17.0.160:6789/0 pipe(0x7ff06c004770 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff06c0049d0).fault
>>> 2014-02-18 16:00:16.996981 7ff07c231700  0 -- 172.17.0.160:0/1000148 >> 172.17.0.160:6789/0 pipe(0x7ff060000c00 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff060000e60).fault
>>> 2014-02-18 16:00:19.998108 7ff07c130700  0 -- 172.17.0.160:0/1000148 >> 172.17.0.160:6789/0 pipe(0x7ff060003010 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7ff060001e70).fault
>>>
>>> The ceph-mon log file shows:
>>>
>>> *** Caught signal (Segmentation fault) **
>>> in thread 7f09109dd700
>>> ceph version 0.75 (946d60369589d6a269938edd65c0a6a7b1c3ef5c)
>>> 1: /usr/local/bin/ceph-mon() [0x83457e]
>>> 2: (()+0xf210) [0x7f0915772210]
>>> 3: /usr/local/bin/ceph-mon() [0x7c398a]
>>> 4: /usr/local/bin/ceph-mon() [0x7c3c9c]
>>> 5: /usr/local/bin/ceph-mon() [0x7c3d31]
>>> 6: (crush_do_rule()+0x20a) [0x7c448a]
>>> 7: (OSDMap::_pg_to_osds(pg_pool_t const&, pg_t, std::vector<int, std::allocator<int> >&) const+0xdd) [0x725add]
>>> 8: (OSDMap::pg_to_acting_osds(pg_t, std::vector<int, std::allocator<int> >&) const+0x81) [0x725da1]
>>> 9: (PGMonitor::map_pg_creates()+0x15f) [0x610abf]
>>> 10: (PGMonitor::post_paxos_update()+0x25) [0x611205]
>>> 11: (Monitor::refresh_from_paxos(bool*)+0x95) [0x543205]
>>> 12: (Paxos::do_refresh()+0x24) [0x590c24]
>>> 13: (Paxos::begin(ceph::buffer::list&)+0x99e) [0x59b54e]
>>> 14: (Paxos::propose_queued()+0xdd) [0x59b92d]
>>> 15: (Paxos::propose_new_value(ceph::buffer::list&, Context*)+0x150) [0x59ca30]
>>> 16: (PaxosService::propose_pending()+0x6d9) [0x5a3099]
>>> 17: (PaxosService::dispatch(PaxosServiceMessage*)+0xd77) [0x5a4347]
>>> 18: (Monitor::handle_command(MMonCommand*)+0x1073) [0x56e253]
>>> 19: (Monitor::dispatch(MonSession*, Message*, bool)+0x2e8) [0x571168]
>>> 20: (Monitor::_ms_dispatch(Message*)+0x1e4) [0x571774]
>>> 21: (Monitor::ms_dispatch(Message*)+0x20) [0x590050]
>>> 22: (DispatchQueue::entry()+0x56a) [0x80a65a]
>>> 23: (DispatchQueue::DispatchThread::entry()+0xd) [0x73e75d]
>>> 24: (()+0x7e0e) [0x7f091576ae0e]
>>> 25: (clone()+0x6d) [0x7f0913d1c0fd]
>>> 2014-02-18 16:00:14.088851 7f09109dd700 -1 *** Caught signal (Segmentation fault) **
>>> in thread 7f09109dd700
>>>
>>> ceph version 0.75 (946d60369589d6a269938edd65c0a6a7b1c3ef5c)
>>> 1: /usr/local/bin/ceph-mon() [0x83457e]
>>> 2: (()+0xf210) [0x7f0915772210]
>>> 3: /usr/local/bin/ceph-mon() [0x7c398a]
>>> 4: /usr/local/bin/ceph-mon() [0x7c3c9c]
>>> 5: /usr/local/bin/ceph-mon() [0x7c3d31]
>>> 6: (crush_do_rule()+0x20a) [0x7c448a]
>>> 7: (OSDMap::_pg_to_osds(pg_pool_t const&, pg_t, std::vector<int, std::allocator<int> >&) const+0xdd) [0x725add]
>>> 8: (OSDMap::pg_to_acting_osds(pg_t, std::vector<int, std::allocator<int> >&) const+0x81) [0x725da1]
>>> 9: (PGMonitor::map_pg_creates()+0x15f) [0x610abf]
>>> 10: (PGMonitor::post_paxos_update()+0x25) [0x611205]
>>> 11: (Monitor::refresh_from_paxos(bool*)+0x95) [0x543205]
>>> 12: (Paxos::do_refresh()+0x24) [0x590c24]
>>> 13: (Paxos::begin(ceph::buffer::list&)+0x99e) [0x59b54e]
>>> 14: (Paxos::propose_queued()+0xdd) [0x59b92d]
>>> 15: (Paxos::propose_new_value(ceph::buffer::list&, Context*)+0x150) [0x59ca30]
>>> 16: (PaxosService::propose_pending()+0x6d9) [0x5a3099]
>>> 17: (PaxosService::dispatch(PaxosServiceMessage*)+0xd77) [0x5a4347]
>>> 18: (Monitor::handle_command(MMonCommand*)+0x1073) [0x56e253]
>>> 19: (Monitor::dispatch(MonSession*, Message*, bool)+0x2e8) [0x571168]
>>> 20: (Monitor::_ms_dispatch(Message*)+0x1e4) [0x571774]
>>> 21: (Monitor::ms_dispatch(Message*)+0x20) [0x590050]
>>> 22: (DispatchQueue::entry()+0x56a) [0x80a65a]
>>> 23: (DispatchQueue::DispatchThread::entry()+0xd) [0x73e75d]
>>> 24: (()+0x7e0e) [0x7f091576ae0e]
>>> 25: (clone()+0x6d) [0x7f0913d1c0fd]
>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- begin dump of recent events ---
>>>  -395> 2014-02-18 15:59:09.388974 7f0915dfb7c0  5 asok(0x354af50) register_command perfcounters_dump hook 0x3542010
>>>  -394> 2014-02-18 15:59:09.389006 7f0915dfb7c0  5 asok(0x354af50) register_command 1 hook 0x3542010
>>>  -393> 2014-02-18 15:59:09.389011 7f0915dfb7c0  5 asok(0x354af50) register_command perf dump hook 0x3542010
>>>  -392> 2014-02-18 15:59:09.389016 7f0915dfb7c0  5 asok(0x354af50) register_command perfcounters_schema hook 0x3542010
>>>  -391> 2014-02-18 15:59:09.389020 7f0915dfb7c0  5 asok(0x354af50) register_command 2 hook 0x3542010
>>>  -390> 2014-02-18 15:59:09.389021 7f0915dfb7c0  5 asok(0x354af50) register_command perf schema hook 0x3542010
>>>  -389> 2014-02-18 15:59:09.389023 7f0915dfb7c0  5 asok(0x354af50) register_command config show hook 0x3542010
>>>  -388> 2014-02-18 15:59:09.389028 7f0915dfb7c0  5 asok(0x354af50) register_command config set hook 0x3542010
>>>  -387> 2014-02-18 15:59:09.389029 7f0915dfb7c0  5 asok(0x354af50) register_command config get hook 0x3542010
>>>  -386> 2014-02-18 15:59:09.389031 7f0915dfb7c0  5 asok(0x354af50) register_command log flush hook 0x3542010
>>>  -385> 2014-02-18 15:59:09.389035 7f0915dfb7c0  5 asok(0x354af50) register_command log dump hook 0x3542010
>>>  -384> 2014-02-18 15:59:09.389037 7f0915dfb7c0  5 asok(0x354af50) register_command log reopen hook 0x3542010
>>>  -383> 2014-02-18 15:59:09.390539 7f0915dfb7c0  0 ceph version 0.75 (946d60369589d6a269938edd65c0a6a7b1c3ef5c), process ceph-mon, pid 6
>>>  -382> 2014-02-18 15:59:09.390870 7f0915dfb7c0  5 asok(0x354af50) init /var/run/ceph/ceph-mon.ceph-mon.dkctl.asok
>>>  -381> 2014-02-18 15:59:09.390898 7f0915dfb7c0  5 asok(0x354af50) bind_and_listen /var/run/ceph/ceph-mon.ceph-mon.dkctl.asok
>>>  -380> 2014-02-18 15:59:09.391018 7f0915dfb7c0  5 asok(0x354af50) register_command 0 hook 0x353e038
>>>  -379> 2014-02-18 15:59:09.391043 7f0915dfb7c0  5 asok(0x354af50) register_command version hook 0x353e038
>>>  -378> 2014-02-18 15:59:09.391046 7f0915dfb7c0  5 asok(0x354af50) register_command git_version hook 0x353e038
>>>  -377> 2014-02-18 15:59:09.391049 7f0915dfb7c0  5 asok(0x354af50) register_command help hook 0x3542050
>>>  -376> 2014-02-18 15:59:09.391051 7f0915dfb7c0  5 asok(0x354af50) register_command get_command_descriptions hook 0x3542040
>>>  -375> 2014-02-18 15:59:09.391104 7f09121e0700  5 asok(0x354af50) entry start
>>>  -374> 2014-02-18 15:59:09.459305 7f0915dfb7c0  1 -- 172.17.0.160:6789/0 learned my addr 172.17.0.160:6789/0
>>>  -373> 2014-02-18 15:59:09.459333 7f0915dfb7c0  1 accepter.accepter.bind my_inst.addr is 172.17.0.160:6789/0 need_addr=0
>>>  -372> 2014-02-18 15:59:09.459359 7f0915dfb7c0  5 adding auth protocol: cephx
>>>  -371> 2014-02-18 15:59:09.459363 7f0915dfb7c0  5 adding auth protocol: cephx
>>>  -370> 2014-02-18 15:59:09.459451 7f0915dfb7c0  1 mon.ceph-mon.dkctl@-1(probing) e1 preinit fsid e90dfd37-98d1-45bb-a847-8590a5ed8e71
>>>  -369> 2014-02-18 15:59:09.459512 7f0915dfb7c0  1 mon.ceph-mon.dkctl@-1(probing) e1  initial_members ceph-mon.dkctl, filtering seed monmap
>>>  -368> 2014-02-18 15:59:09.459524 7f0915dfb7c0  1  keeping ceph-mon.dkctl 172.17.0.160:6789/0
>>>  -367> 2014-02-18 15:59:09.459812 7f0915dfb7c0  2 auth: KeyRing::load: loaded key file /data/mon/keyring
>>>  -366> 2014-02-18 15:59:09.459832 7f0915dfb7c0  5 asok(0x354af50) register_command mon_status hook 0x35420e0
>>>  -365> 2014-02-18 15:59:09.459838 7f0915dfb7c0  5 asok(0x354af50) register_command quorum_status hook 0x35420e0
>>>  -364> 2014-02-18 15:59:09.459840 7f0915dfb7c0  5 asok(0x354af50) register_command sync_force hook 0x35420e0
>>>  -363> 2014-02-18 15:59:09.459842 7f0915dfb7c0  5 asok(0x354af50) register_command add_bootstrap_peer_hint hook 0x35420e0
>>>  -362> 2014-02-18 15:59:09.459844 7f0915dfb7c0  5 asok(0x354af50) register_command quorum enter hook 0x35420e0
>>>  -361> 2014-02-18 15:59:09.459845 7f0915dfb7c0  5 asok(0x354af50) register_command quorum exit hook 0x35420e0
>>>  -360> 2014-02-18 15:59:09.459851 7f0915dfb7c0  1 -- 172.17.0.160:6789/0 messenger.start
>>>  -359> 2014-02-18 15:59:09.459917 7f0915dfb7c0  2 mon.ceph-mon.dkctl@-1(probing) e1 init
>>>  -358> 2014-02-18 15:59:09.459979 7f0915dfb7c0  1 accepter.accepter.start
>>>  -357> 2014-02-18 15:59:09.460029 7f0915dfb7c0  0 mon.ceph-mon.dkctl@-1(probing) e1  my rank is now 0 (was -1)
>>>  -356> 2014-02-18 15:59:09.460033 7f0915dfb7c0  1 -- 172.17.0.160:6789/0 mark_down_all
>>>  -355> 2014-02-18 15:59:09.460045 7f0915dfb7c0  1 mon.ceph-mon.dkctl@0(probing) e1 win_standalone_election
>>>  -354> 2014-02-18 15:59:09.482424 7f0915dfb7c0  0 log [INF] : mon.ceph-mon.dkctl@0 won leader election with quorum 0
>>>  -353> 2014-02-18 15:59:09.482450 7f0915dfb7c0 10 send_log to self
>>>  -352> 2014-02-18 15:59:09.482453 7f0915dfb7c0 10  log_queue is 1 last_log 1 sent 0 num 1 unsent 1 sending 1
>>>  -351> 2014-02-18 15:59:09.482457 7f0915dfb7c0 10  will send 2014-02-18 15:59:09.482449 mon.0 172.17.0.160:6789/0 1 : [INF] mon.ceph-mon.dkctl@0 won leader election with quorum 0
>>>  -350> 2014-02-18 15:59:09.482491 7f0915dfb7c0  1 -- 172.17.0.160:6789/0 --> mon.0 172.17.0.160:6789/0 -- log(1 entries) v1 -- ?+0 0x35866c0
>>>  -349> 2014-02-18 15:59:09.482564 7f09109dd700  1 -- 172.17.0.160:6789/0 <== mon.0 172.17.0.160:6789/0 0 ==== log(1 entries) v1 ==== 0+0+0 (0 0 0) 0x35866c0 con 0x359a420
>>>  -348> 2014-02-18 15:59:09.482598 7f0915dfb7c0  5 mon.ceph-mon.dkctl@0(leader).paxos(paxos active c 0..0) queue_proposal bl 398 bytes; ctx = 0x35420c0
>>>  -347> 2014-02-18 15:59:09.530752 7f0915dfb7c0  0 log [INF] : pgmap v1: 0 pgs: ; 0 bytes data, 0 kB used, 0 kB / 0 kB avail
>>>  -346> 2014-02-18 15:59:09.530776 7f0915dfb7c0 10 send_log to self
>>>  -345> 2014-02-18 15:59:09.530778 7f0915dfb7c0 10  log_queue is 2 last_log 2 sent 1 num 2 unsent 1 sending 1
>>>  -344> 2014-02-18 15:59:09.530781 7f0915dfb7c0 10  will send 2014-02-18 15:59:09.482449 mon.0 172.17.0.160:6789/0 1 : [INF] mon.ceph-mon.dkctl@0 won leader election with quorum 0
>>>  -343> 2014-02-18 15:59:09.530808 7f0915dfb7c0  1 -- 172.17.0.160:6789/0 --> mon.0 172.17.0.160:6789/0 -- log(1 entries) v1 -- ?+0 0x3586d80
>>>  -342> 2014-02-18 15:59:09.530898 7f0915dfb7c0  5 mon.ceph-mon.dkctl@0(leader).paxos(paxos active c 1..1) queue_proposal bl 477 bytes; ctx = 0x35420c0
>>>  -341> 2014-02-18 15:59:09.578860 7f0915dfb7c0  4 mon.ceph-mon.dkctl@0(leader).mds e1 new map
>>>  -340> 2014-02-18 15:59:09.578888 7f0915dfb7c0  0 mon.ceph-mon.dkctl@0(leader).mds e1 print_map
>>>
>>> With best regards,
>>>  Pavel.
>>>
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



