Re: ceph-mgr SIGABRTs on startup after cluster upgrade from Kraken to Luminous


 



It seems like it's choking on the report from the rados gateway. What
version is the rgw node running?
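
For readers following along: the named frames in the backtrace quoted further down show the abort happening while the manager processes a daemon report (DaemonServer::handle_report calling DaemonPerfCounters::update). A minimal sketch for pulling only the named frames out of a log or a quoted mail follows. The function name named_frames is made up for illustration, and the log path in the usage note is an assumed default, not something stated in this thread.

```shell
# Print only the frames of a Ceph crash backtrace that carry a symbol name,
# dropping unresolved "(()+0x...)" frames. Reads log text on stdin.
named_frames() {
    # 1. Strip leading mail-quote markers and indentation (">>>  ").
    # 2. Keep lines shaped like "N: (Symbol...", skipping "N: (()+0x...".
    # 3. Reduce each kept line to "N Symbol(args)".
    sed -E 's/^[> ]+//' \
        | grep -E '^[0-9]+: \([^()]' \
        | sed -E 's/^([0-9]+): \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\].*$/\1 \2/'
}
```

Possible usage, assuming the log lives in the usual place: `named_frames < /var/log/ceph/ceph-mgr.$HOSTNAME.log`.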

If possible, could you shut down the rgw and see if you can then start ceph-mgr?

This is purely a stab in the dark, just to see whether the problem is tied to the rgw instance.
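
A rough sketch of that check, under stated assumptions: the gateway is managed by systemd under a ceph-radosgw@ unit (the instance name you pass in is yours to fill in; nothing in this thread states it), the commands are run as root, and the run/stop_rgw/start_mgr helpers are hypothetical names used only here. The ceph-mgr invocation is the one from the original report quoted below.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, else run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the rgw node: record the gateway version, then stop the gateway.
# $1 is the systemd unit, e.g. ceph-radosgw@rgw.<name> (assumed naming).
stop_rgw() {
    run radosgw --version
    run systemctl stop "$1"
}

# On a mon/mgr node: start ceph-mgr in the foreground and watch whether
# it still aborts. $1 is the mgr id (the original report used $HOSTNAME).
start_mgr() {
    run /usr/bin/ceph-mgr -f --cluster ceph --id "$1" --setuser ceph --setgroup ceph
}
```

Setting DRY_RUN=1 first prints the commands without running them, which is a cheap way to confirm the unit name and id before touching a live gateway.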

On Tue, Sep 12, 2017 at 1:07 PM, Katie Holly <holly@xxxxxxxxx> wrote:
> Thanks, I totally forgot to check the tracker. I added the information I collected there, but I don't have enough experience with Ceph to dig through this myself, so let's see if someone is willing to sacrifice their free time to help debug this issue.
>
> --
> Katie
>
> On 2017-09-12 03:15, Brad Hubbard wrote:
>> Looks like there is a tracker opened for this.
>>
>> http://tracker.ceph.com/issues/21197
>>
>> Please add your details there.
>>
>> On Tue, Sep 12, 2017 at 11:04 AM, Katie Holly <holly@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I recently upgraded one of our clusters from Kraken to Luminous (the cluster was initialized with Jewel) on Ubuntu 16.04 and deployed ceph-mgr on all of our ceph-mon nodes with ceph-deploy.
>>>
>>> Related log entries after initial deployment of ceph-mgr:
>>>
>>> 2017-09-11 06:41:53.535025 7fb5aa7b8500  0 set uid:gid to 64045:64045 (ceph:ceph)
>>> 2017-09-11 06:41:53.535048 7fb5aa7b8500  0 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process (unknown), pid 17031
>>> 2017-09-11 06:41:53.536853 7fb5aa7b8500  0 pidfile_write: ignore empty --pid-file
>>> 2017-09-11 06:41:53.541880 7fb5aa7b8500  1 mgr send_beacon standby
>>> 2017-09-11 06:41:54.547383 7fb5a1aec700  1 mgr handle_mgr_map Activating!
>>> 2017-09-11 06:41:54.547575 7fb5a1aec700  1 mgr handle_mgr_map I am now activating
>>> 2017-09-11 06:41:54.650677 7fb59dae4700  1 mgr start Creating threads for 0 modules
>>> 2017-09-11 06:41:54.650696 7fb59dae4700  1 mgr send_beacon active
>>> 2017-09-11 06:41:55.542252 7fb59eae6700  1 mgr send_beacon active
>>> 2017-09-11 06:41:55.542627 7fb59eae6700  1 mgr.server send_report Not sending PG status to monitor yet, waiting for OSDs
>>> 2017-09-11 06:41:57.542697 7fb59eae6700  1 mgr send_beacon active
>>> [... lots of "send_beacon active" messages ...]
>>> 2017-09-11 07:29:29.640892 7fb59eae6700  1 mgr send_beacon active
>>> 2017-09-11 07:29:30.866366 7fb59d2e3700 -1 *** Caught signal (Aborted) **
>>>  in thread 7fb59d2e3700 thread_name:ms_dispatch
>>>
>>>  ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
>>>  1: (()+0x3de6b4) [0x55f6640e16b4]
>>>  2: (()+0x11390) [0x7fb5a8fef390]
>>>  3: (gsignal()+0x38) [0x7fb5a7f7f428]
>>>  4: (abort()+0x16a) [0x7fb5a7f8102a]
>>>  5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7fb5a88c284d]
>>>  6: (()+0x8d6b6) [0x7fb5a88c06b6]
>>>  7: (()+0x8d701) [0x7fb5a88c0701]
>>>  8: (()+0x8d919) [0x7fb5a88c0919]
>>>  9: (()+0x2318ad) [0x55f663f348ad]
>>>  10: (()+0x3e91bd) [0x55f6640ec1bd]
>>>  11: (DaemonPerfCounters::update(MMgrReport*)+0x821) [0x55f663f96651]
>>>  12: (DaemonServer::handle_report(MMgrReport*)+0x1ae) [0x55f663f9b79e]
>>>  13: (DaemonServer::ms_dispatch(Message*)+0x64) [0x55f663fa8d64]
>>>  14: (DispatchQueue::entry()+0xf4a) [0x55f664438f3a]
>>>  15: (DispatchQueue::DispatchThread::entry()+0xd) [0x55f6641dc44d]
>>>  16: (()+0x76ba) [0x7fb5a8fe56ba]
>>>  17: (clone()+0x6d) [0x7fb5a80513dd]
>>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>>
>>> --- begin dump of recent events ---
>>> [...]
>>>
>>>
>>> I tried to manually run ceph-mgr with
>>>> /usr/bin/ceph-mgr -f --cluster ceph --id $HOSTNAME --setuser ceph --setgroup ceph
>>> which fails within a few seconds of starting.
>>> stdout: http://xor.meo.ws/OyvoZF8v0aWq0D-rOOg2y6u03fp_yzYv.txt
>>> logs: http://xor.meo.ws/jcMyjabCfFbTcfZ8GOangLdSfSSqJffr.txt
>>> objdump: http://xor.meo.ws/oxo2q8h_oKAG6q7mARvNKkR_JdYjn89B.txt
>>>
>>> Has anyone seen this issue before, and does anyone know how to debug or even fix it?
>>>
>>>
>>> --
>>> Katie
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>>



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


