Re: kraken <--> luminous osd map decoding

Yeah, I need the actual commit IDs please so I make sure I'm looking
at the right place. Checking out kraken on my box isn't showing
anything sensible with your backtraces. ;)

Also your timeouts make me think something else has gone wrong for
you. We *do* test with clients that have differing versions, although
I'm not sure how complete the matrix is.
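
One thing that stands out in frame #10 of the backtrace is the vector length: 16777216 is 0x01000000, which is what a small value looks like when a 32-bit little-endian length prefix is read a few bytes off from where it was written. Below is a minimal Python sketch of that failure mode, not Ceph's actual denc code. The names decode_u32_vector and EndOfBuffer are hypothetical stand-ins, and the one-byte shift is only an assumption about how such a count could arise, not a diagnosis of this bug.

```python
import struct


class EndOfBuffer(Exception):
    """Hypothetical stand-in for ceph::buffer::end_of_buffer."""


def decode_u32_vector(buf, offset=0):
    """Decode a little-endian u32 count followed by that many u32 values."""
    if offset + 4 > len(buf):
        raise EndOfBuffer("no room for the length prefix")
    (count,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    values = []
    for _ in range(count):
        # Bounds check before every element, like a checked buffer iterator.
        if offset + 4 > len(buf):
            raise EndOfBuffer(
                "wanted %d elements, buffer ended after %d" % (count, len(values))
            )
        values.append(struct.unpack_from("<I", buf, offset)[0])
        offset += 4
    return values


# Well-formed case: count = 3, then the three values.
good = struct.pack("<I3I", 3, 7, 8, 9)

# Assumed mismatch: these bytes hold the u32 values 0 and 1, but the
# decoder starts one byte late, so the "count" it sees is 00 00 00 01.
shifted = struct.pack("<II", 0, 1)
bogus_count = struct.unpack_from("<I", shifted, 1)[0]
```

Here decode_u32_vector(good) returns the three values, while decode_u32_vector(shifted, 1) reads bogus_count as 16777216 and raises EndOfBuffer on the first element, because only three bytes remain. If the encoder and decoder disagree about the layout in a similar way, that would be consistent with the backtrace, but it needs the exact commit IDs to confirm.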

On Fri, Jul 7, 2017 at 1:43 PM, Noah Watkins <noahwatkins@xxxxxxxxx> wrote:
> So far the summary of pass/fail from messing around with the test matrix is:
>
> - Everything is installed fresh from download.ceph.com
>
> - The OSD in all tests is:
>   Ubuntu Xenial + Latest Luminous
>
> - Client Setup 1:
>
> Client: jessie, trusty, xenial, yakkety, zesty + Kraken
>   - buffer error from above
> Client: centos7, fedora 23 + Kraken
>   - timeouts on connect
> Client: fedora 24, 25 + Kraken
>   - everything works fine
>
> - Client Setup 2:
>
> Same as setup 1 but used Luminous in Debian/Ubuntu clients. All errors
> went away. So only the timeouts remain on centos7 and fedora 23.
>
> - Client Setup 3:
>
> Same as setup 1 but used Luminous in rhel/centos/fedora clients;
> centos7 and fedora 23 still time out.
>
> We haven't really messed around with our CI matrix setup, so this all
> started sometime in the last month or so.
>
> On Fri, Jul 7, 2017 at 1:09 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> Exactly what version/release is this?
>>
>> And are you saying everything works fine on newish Fedora nodes, but
>> you get timeouts on Centos7 and that failure on Debian derivatives?
>> That is super weird.
>>
>> On Fri, Jul 7, 2017 at 11:42 AM, Noah Watkins <noahwatkins@xxxxxxxxx> wrote:
>>> An important thing I noticed:
>>>
>>> Fedora 24/25: no errors
>>> CentOS 7 and Fedora 23: cluster.connect => -ETIMEDOUT
>>> All debian/ubuntu: above error: ceph::buffer::end_of_buffer
>>>
>>> On Fri, Jul 7, 2017 at 11:39 AM, Noah Watkins <noahwatkins@xxxxxxxxx> wrote:
>>>> I'm getting osd map decoding errors on cluster connection from a
>>>> kraken client with a luminous backend. I searched around the mailing
>>>> list but didn't see this reported, or any note on backwards
>>>> incompatibility. Back trace:
>>>>
>>>> ===
>>>>
>>>> terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
>>>>   what():  buffer::end_of_buffer
>>>>
>>>> Thread 9 "ms_dispatch" received signal SIGABRT, Aborted.
>>>> [Switching to Thread 0x7fffd3fff700 (LWP 43060)]
>>>> 0x00007fffee73c428 in __GI_raise (sig=sig@entry=6) at
>>>> ../sysdeps/unix/sysv/linux/raise.c:54
>>>> 54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>>>> (gdb) bt
>>>> #0  0x00007fffee73c428 in __GI_raise (sig=sig@entry=6) at
>>>> ../sysdeps/unix/sysv/linux/raise.c:54
>>>> #1  0x00007fffee73e02a in __GI_abort () at abort.c:89
>>>> #2  0x00007fffeed7684d in __gnu_cxx::__verbose_terminate_handler() ()
>>>> from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>> #3  0x00007fffeed746b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>> #4  0x00007fffeed74701 in std::terminate() () from
>>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>> #5  0x00007fffeed74919 in __cxa_throw () from
>>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>>> #6  0x00007fffef49cbb2 in ceph::buffer::ptr::iterator::get_pos_add
>>>> (this=<optimized out>, this=<optimized out>, n=4) at
>>>> /home/nwatkins/ceph/src/include/buffer.h:196
>>>> #7  0x00007fffef4b5509 in ceph::buffer::ptr::iterator::get_pos_add
>>>> (this=<synthetic pointer>, this=<synthetic pointer>, n=4) at
>>>> /usr/include/c++/5/bits/stl_vector.h:676
>>>> #8  denc_traits<unsigned int, void>::decode (p=<synthetic pointer>,
>>>> o=<optimized out>) at /home/nwatkins/ceph/src/include/denc.h:224
>>>> #9  denc<unsigned int, denc_traits<unsigned int, void> > (features=0,
>>>> p=<synthetic pointer>, o=<optimized out>) at
>>>> /home/nwatkins/ceph/src/include/denc.h:496
>>>> #10 denc_traits<std::vector<unsigned int, std::allocator<unsigned int>
>>>> >, void>::decode (p=<synthetic pointer>, s=std::vector of length
>>>> 16777216, capacity 16777216 = {...})
>>>>     at /home/nwatkins/ceph/src/include/denc.h:796
>>>> #11 decode<std::vector<unsigned int, std::allocator<unsigned int> >,
>>>> denc_traits<std::vector<unsigned int, std::allocator<unsigned int> >,
>>>> void> > (
>>>>     o=std::vector of length 16777216, capacity 16777216 = {...},
>>>> p=...) at /home/nwatkins/ceph/src/include/denc.h:1157
>>>> #12 0x00007fffef4aee0c in OSDMap::decode (this=this@entry=0x817f20,
>>>> bl=...) at /home/nwatkins/ceph/src/osd/OSDMap.cc:2144
>>>> #13 0x00007fffef4b0b2e in OSDMap::decode (this=0x817f20, bl=...) at
>>>> /home/nwatkins/ceph/src/osd/OSDMap.cc:1991
>>>> #14 0x00007fffef387d2b in Objecter::handle_osd_map
>>>> (this=this@entry=0x8178a0, m=m@entry=0x7fffd80012e0) at
>>>> /home/nwatkins/ceph/src/osdc/Objecter.cc:1242
>>>> #15 0x00007fffef388a87 in Objecter::ms_dispatch (this=0x8178a0,
>>>> m=0x7fffd80012e0) at /home/nwatkins/ceph/src/osdc/Objecter.cc:1005
>>>> #16 0x00007fffef5fd8ca in Messenger::ms_deliver_dispatch
>>>> (m=0x7fffd80012e0, this=0x789fb0) at
>>>> /home/nwatkins/ceph/src/msg/Messenger.h:593
>>>> #17 DispatchQueue::entry (this=0x78a128) at
>>>> /home/nwatkins/ceph/src/msg/DispatchQueue.cc:197
>>>> #18 0x00007fffef47968d in DispatchQueue::DispatchThread::entry
>>>> (this=<optimized out>) at
>>>> /home/nwatkins/ceph/src/msg/DispatchQueue.h:103
>>>> #19 0x00007fffef0706ba in start_thread (arg=0x7fffd3fff700) at
>>>> pthread_create.c:333
>>>> #20 0x00007fffee80e3dd in clone () at
>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html