Re: Ceph Status - Segmentation Fault

On Wed, May 25, 2016 at 3:00 PM, Mathias Buresch
<mathias.buresch@xxxxxxxxxxxx> wrote:
> I don't know what exactly is segfaulting.
>
> Here is the output with the command line flags and gdb (I can't really
> see any errors in that output):
>
> # ceph -s --debug-monc=20 --debug-ms=20
> 2016-05-25 14:51:02.406135 7f188300a700 10 monclient(hunting):
> build_initial_monmap
> 2016-05-25 14:51:02.406444 7f188300a700 10 -- :/0 ready :/0
> 2016-05-25 14:51:02.407214 7f188300a700  1 -- :/0 messenger.start
> 2016-05-25 14:51:02.407261 7f188300a700 10 monclient(hunting): init
> 2016-05-25 14:51:02.407291 7f188300a700 10 monclient(hunting):
> auth_supported 2 method cephx
> 2016-05-25 14:51:02.407312 7f187b7fe700 10 -- :/2987460054 reaper_entry
> start
> 2016-05-25 14:51:02.407380 7f187b7fe700 10 -- :/2987460054 reaper
> 2016-05-25 14:51:02.407383 7f187b7fe700 10 -- :/2987460054 reaper done
> 2016-05-25 14:51:02.407638 7f188300a700 10 monclient(hunting):
> _reopen_session rank -1 name
> 2016-05-25 14:51:02.407646 7f188300a700 10 -- :/2987460054 connect_rank
> to 62.176.141.181:6789/0, creating pipe and registering
> 2016-05-25 14:51:02.407686 7f188300a700 10 -- :/2987460054 >>
> 62.176.141.181:6789/0 pipe(0x7f187c064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).register_pipe
> 2016-05-25 14:51:02.407698 7f188300a700 10 -- :/2987460054
> get_connection mon.0 62.176.141.181:6789/0 new 0x7f187c064010
> 2016-05-25 14:51:02.407693 7f1879ffb700 10 -- :/2987460054 >>
> 62.176.141.181:6789/0 pipe(0x7f187c064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).writer: state = connecting policy.server=0
> 2016-05-25 14:51:02.407723 7f1879ffb700 10 -- :/2987460054 >>
> 62.176.141.181:6789/0 pipe(0x7f187c064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).connect 0
> 2016-05-25 14:51:02.407738 7f188300a700 10 monclient(hunting): picked
> mon.pix01 con 0x7f187c05aa50 addr 62.176.141.181:6789/0
> 2016-05-25 14:51:02.407745 7f188300a700 20 -- :/2987460054
> send_keepalive con 0x7f187c05aa50, have pipe.
> 2016-05-25 14:51:02.407744 7f1879ffb700 10 -- :/2987460054 >>
> 62.176.141.181:6789/0 pipe(0x7f187c064010 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).connecting to 62.176.141.181:6789/0
> 2016-05-25 14:51:02.407759 7f188300a700 10 monclient(hunting):
> _send_mon_message to mon.pix01 at 62.176.141.181:6789/0
> 2016-05-25 14:51:02.407763 7f188300a700  1 -- :/2987460054 -->
> 62.176.141.181:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0
> 0x7f187c060380 con 0x7f187c05aa50
> 2016-05-25 14:51:02.407768 7f188300a700 20 -- :/2987460054
> submit_message auth(proto 0 30 bytes epoch 0) v1 remote,
> 62.176.141.181:6789/0, have pipe.
> 2016-05-25 14:51:02.407773 7f188300a700 10 monclient(hunting):
> renew_subs
> 2016-05-25 14:51:02.407777 7f188300a700 10 monclient(hunting):
> authenticate will time out at 2016-05-25 14:56:02.407777
> 2016-05-25 14:51:02.408128 7f1879ffb700 20 -- :/2987460054 >>
> 62.176.141.181:6789/0 pipe(0x7f187c064010 sd=3 :37964 s=1 pgs=0 cs=0
> l=1 c=0x7f187c05aa50).connect read peer addr 62.176.141.181:6789/0 on
> socket 3
> 2016-05-25 14:51:02.408144 7f1879ffb700 20 -- :/2987460054 >>
> 62.176.141.181:6789/0 pipe(0x7f187c064010 sd=3 :37964 s=1 pgs=0 cs=0
> l=1 c=0x7f187c05aa50).connect peer addr for me is
> 62.176.141.181:37964/0
> 2016-05-25 14:51:02.408148 7f1879ffb700  1 --
> 62.176.141.181:0/2987460054 learned my addr 62.176.141.181:0/2987460054
> 2016-05-25 14:51:02.408188 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).connect sent my addr 62.176.141.181:0/2987460054
> 2016-05-25 14:51:02.408197 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).connect sending gseq=1 cseq=0 proto=15
> 2016-05-25 14:51:02.408207 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).connect wrote (self +) cseq, waiting for reply
> 2016-05-25 14:51:02.408259 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=1 pgs=0 cs=0 l=1
> c=0x7f187c05aa50).connect got reply tag 1 connect_seq 1 global_seq
> 327710 proto 15 flags 1 features 55169095435288575
> 2016-05-25 14:51:02.408269 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).connect success 1, lossy = 1, features
> 55169095435288575
> 2016-05-25 14:51:02.408280 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).connect starting reader
> 2016-05-25 14:51:02.408325 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer: state = open policy.server=0
> 2016-05-25 14:51:02.408343 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).write_keepalive2 14 2016-05-25 14:51:02.408342
> 2016-05-25 14:51:02.408378 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer encoding 1 features 55169095435288575
> 0x7f187c060380 auth(proto 0 30 bytes epoch 0) v1
> 2016-05-25 14:51:02.408356 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader reading tag...
> 2016-05-25 14:51:02.408406 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer no session security
> 2016-05-25 14:51:02.408415 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer sending 1 0x7f187c060380
> 2016-05-25 14:51:02.408453 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer: state = open policy.server=0
> 2016-05-25 14:51:02.408455 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got KEEPALIVE_ACK
> 2016-05-25 14:51:02.408463 7f1879ffb700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer sleeping
> 2016-05-25 14:51:02.408482 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader reading tag...
> 2016-05-25 14:51:02.408696 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got ACK
> 2016-05-25 14:51:02.408713 7f1879efa700 15 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got ack seq 1
> 2016-05-25 14:51:02.408721 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader reading tag...
> 2016-05-25 14:51:02.408732 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got MSG
> 2016-05-25 14:51:02.408739 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got envelope type=4 src mon.0 front=340 data=0
> off 0
> 2016-05-25 14:51:02.408751 7f1879efa700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader wants 340 from dispatch throttler 0/104857600
> 2016-05-25 14:51:02.408763 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got front 340
> 2016-05-25 14:51:02.408770 7f1879efa700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).aborted = 0
> 2016-05-25 14:51:02.408776 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got 340 + 0 + 0 byte message
> 2016-05-25 14:51:02.408801 7f1879efa700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).No session security set
> 2016-05-25 14:51:02.408813 7f1879efa700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader got message 1 0x7f186c001cb0 mon_map magic: 0
> v1
> 2016-05-25 14:51:02.408827 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 queue 0x7f186c001cb0 prio 196
> 2016-05-25 14:51:02.408837 7f1879efa700 20 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).reader reading tag...
> 2016-05-25 14:51:02.408851 7f1879ffb700 10 --
> 62.176.141.181:0/2987460054 >> 62.176.141.181:6789/0
> pipe(0x7f187c064010 sd=3 :37964 s=2 pgs=327710 cs=1 l=1
> c=0x7f187c05aa50).writer: state = open policy.server=0
> Segmentation fault
>
>
> (gdb) run /usr/bin/ceph status
> Starting program: /usr/bin/python /usr/bin/ceph status
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-
> gnu/libthread_db.so.1".
> [New Thread 0x7ffff10f5700 (LWP 23401)]
> [New Thread 0x7ffff08f4700 (LWP 23402)]
> [Thread 0x7ffff10f5700 (LWP 23401) exited]
> [New Thread 0x7ffff10f5700 (LWP 23403)]
> [Thread 0x7ffff10f5700 (LWP 23403) exited]
> [New Thread 0x7ffff10f5700 (LWP 23404)]
> [Thread 0x7ffff10f5700 (LWP 23404) exited]
> [New Thread 0x7ffff10f5700 (LWP 23405)]
> [Thread 0x7ffff10f5700 (LWP 23405) exited]
> [New Thread 0x7ffff10f5700 (LWP 23406)]
> [Thread 0x7ffff10f5700 (LWP 23406) exited]
> [New Thread 0x7ffff10f5700 (LWP 23407)]
> [Thread 0x7ffff10f5700 (LWP 23407) exited]
> [New Thread 0x7ffff10f5700 (LWP 23408)]
> [New Thread 0x7fffeb885700 (LWP 23409)]
> [New Thread 0x7fffeb084700 (LWP 23410)]
> [New Thread 0x7fffea883700 (LWP 23411)]
> [New Thread 0x7fffea082700 (LWP 23412)]
> [New Thread 0x7fffe9881700 (LWP 23413)]
> [New Thread 0x7fffe9080700 (LWP 23414)]
> [New Thread 0x7fffe887f700 (LWP 23415)]
> [New Thread 0x7fffe807e700 (LWP 23416)]
> [New Thread 0x7fffe7f7d700 (LWP 23419)]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffea883700 (LWP 23411)]
> 0x00007ffff3141a57 in ?? () from /usr/lib/librados.so.2
> (gdb) bt
> #0  0x00007ffff3141a57 in ?? () from /usr/lib/librados.so.2
> #1  0x00007ffff313aff4 in ?? () from /usr/lib/librados.so.2
> #2  0x00007ffff2fe4a79 in ?? () from /usr/lib/librados.so.2
> #3  0x00007ffff2fe6507 in ?? () from /usr/lib/librados.so.2
> #4  0x00007ffff30d5dc9 in ?? () from /usr/lib/librados.so.2
> #5  0x00007ffff31023bd in ?? () from /usr/lib/librados.so.2
> #6  0x00007ffff7bc4182 in start_thread () from /lib/x86_64-linux-
> gnu/libpthread.so.0
> #7  0x00007ffff78f147d in clone () from /lib/x86_64-linux-gnu/libc.so.6
>
>
> Does that help? I can't really see where the error is. :)

Hmm, can you try getting that backtrace again after installing the
ceph-debuginfo package? Also add --debug-rados=20 to your command
line (you can pass all the --debug... options while running inside
gdb, so you get the logs and the backtrace in one go).
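
For example, something along these lines should do it (the debug symbol
package name varies by distro: ceph-debuginfo on RPM-based systems, or
the -dbg packages such as ceph-dbg / librados2-dbg on Debian/Ubuntu; I'm
not sure which applies to your setup, so check your repositories):

apt-get install librados2-dbg   # or: yum install ceph-debuginfo
gdb python
(gdb) run /usr/bin/ceph status --debug-monc=20 --debug-ms=20 --debug-rados=20
... hopefully it crashes again, and then ...
(gdb) bt
(gdb) thread apply all bt   # full backtrace from every thread, just in case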

John

>
> -----Original Message-----
> From: John Spray <jspray@xxxxxxxxxx>
> To: Mathias Buresch <mathias.buresch@xxxxxxxxxxxx>
> Cc: ceph-users@xxxxxxxx <ceph-users@xxxxxxxx>
> Subject: Re: Ceph Status - Segmentation Fault
> Date: Wed, 25 May 2016 10:16:55 +0100
>
> On Mon, May 23, 2016 at 12:41 PM, Mathias Buresch
> <mathias.buresch@xxxxxxxxxxxx> wrote:
>>
>> Please find the logs with a higher debug level attached to this email.
> You've attached the log from your mon, but it's not your mon that's
> segfaulting, right?
>
> You can use normal ceph command line flags to crank up the verbosity
> on the CLI too (--debug-monc=20 --debug-ms=20 spring to mind).
>
> You can also run the ceph CLI in gdb like this:
> gdb python
> (gdb) run /usr/bin/ceph status
> ... hopefully it crashes and then ...
> (gdb) bt
>
> Cheers,
> John
>
>>
>>
>>
>> Kind regards
>> Mathias
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


