Re: kraken <--> luminous osd map decoding

So far the summary of pass/fail from messing around with the test matrix is:

- Everything is installed fresh from download.ceph.com

- The OSD in all tests is:
  Ubuntu Xenial + Latest Luminous

- Client Setup 1:

Client: jessie, trusty, xenial, yakkety, zesty + Kraken
  - buffer error from above
Client: centos7, fedora 23 + Kraken
  - timeouts on connect
Client: fedora 24, 25 + Kraken
  - everything works fine

- Client Setup 2:

Same as setup 1 but used Luminous in Debian/Ubuntu clients. All errors
went away. So only the timeouts remain on centos7 and fedora 23.

- Client Setup 3:

Same as setup 1 but used Luminous in rhel/centos/fedora clients. centos7
and fedora 23 still time out.
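
To make the pattern in the three setups easier to see, here is a rough
sketch that records only the outcomes stated above and splits the clients
into those whose result changes with the client release and those that
fail either way. The names RESULTS, release_sensitive and always_failing
are made up for illustration, and the outcome labels are just shorthand
for the results listed above.

```python
# Outcomes taken from the summary above, keyed by (client distro, client release).
# Combinations that were not reported are left out on purpose.
DEBIAN_LIKE = ["jessie", "trusty", "xenial", "yakkety", "zesty"]

RESULTS = {}
for distro in DEBIAN_LIKE:
    RESULTS[(distro, "kraken")] = "decode_error"   # setup 1
    RESULTS[(distro, "luminous")] = "ok"           # setup 2
for distro in ["centos7", "fedora23"]:
    RESULTS[(distro, "kraken")] = "timeout"        # setup 1
    RESULTS[(distro, "luminous")] = "timeout"      # setup 3
for distro in ["fedora24", "fedora25"]:
    RESULTS[(distro, "kraken")] = "ok"             # setup 1


def _by_distro(results):
    """Group outcomes as {distro: {release: outcome}}."""
    grouped = {}
    for (distro, release), outcome in results.items():
        grouped.setdefault(distro, {})[release] = outcome
    return grouped


def release_sensitive(results):
    """Distros whose outcome differs between the client releases tested."""
    return sorted(
        distro
        for distro, outcomes in _by_distro(results).items()
        if len(set(outcomes.values())) > 1
    )


def always_failing(results):
    """Distros that fail on every client release that was tested."""
    return sorted(
        distro
        for distro, outcomes in _by_distro(results).items()
        if all(outcome != "ok" for outcome in outcomes.values())
    )
```

With the data above, release_sensitive(RESULTS) lists the Debian/Ubuntu
clients and always_failing(RESULTS) lists centos7 and fedora23, which
suggests the decode error and the timeouts are two separate problems.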

We haven't really messed around with our CI matrix setup, so this all
started sometime in the last month or so.
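
One observation on the quoted backtrace: frame #10 is decoding a
std::vector<unsigned int> whose length is already 16777216 (0x01000000),
and frame #6 throws while reading the next 4 bytes. That is the shape of
failure a length-prefixed decoder produces when the count it reads is not
the count that was encoded. The sketch below is plain Python, not Ceph
code; read_u32, decode_u32_vector and EndOfBuffer are hypothetical names,
and the 3-byte misalignment is only one assumed way to end up with such a
count, not a diagnosis of this bug.

```python
import struct


class EndOfBuffer(Exception):
    """Raised when a read would run past the end of the buffer."""


def read_u32(buf, pos):
    """Read one little-endian u32 at pos; return (value, new position)."""
    if pos + 4 > len(buf):
        raise EndOfBuffer(f"need 4 bytes at offset {pos}, buffer is {len(buf)} bytes")
    (value,) = struct.unpack_from("<I", buf, pos)
    return value, pos + 4


def decode_u32_vector(buf, pos=0):
    """Decode a u32 count followed by that many u32 elements.

    Elements are read one at a time, so a bogus count fails as soon as
    the buffer runs out instead of allocating the whole vector up front.
    """
    count, pos = read_u32(buf, pos)
    items = []
    for _ in range(count):
        value, pos = read_u32(buf, pos)
        items.append(value)
    return items, pos


# Three padding bytes, then a vector holding the single value 7.
EXAMPLE = b"\x00\x00\x00" + struct.pack("<I", 1) + struct.pack("<I", 7)
```

Decoding EXAMPLE at offset 3 returns the one-element vector. Decoding the
same bytes at offset 0 reads 00 00 00 01 as the count, which is 16777216
in little-endian, and then raises EndOfBuffer after one element because
the buffer is far too short for that many entries.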

On Fri, Jul 7, 2017 at 1:09 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> Exactly what version/release is this?
>
> And are you saying everything works fine on newish Fedora nodes, but
> you get timeouts on Centos7 and that failure on Debian derivatives?
> That is super weird.
>
> On Fri, Jul 7, 2017 at 11:42 AM, Noah Watkins <noahwatkins@xxxxxxxxx> wrote:
>> An important thing I noticed:
>>
>> Fedora 24/25: no errors
>> CentOS 7 and Fedora 23: cluster.connect => -ETIMEDOUT
>> All debian/ubuntu: above error: ceph::buffer::end_of_buffer
>>
>> On Fri, Jul 7, 2017 at 11:39 AM, Noah Watkins <noahwatkins@xxxxxxxxx> wrote:
>>> I'm getting osd map decoding errors on cluster connection from a
>>> kraken client with a luminous backend. I searched around the mailing
>>> list but didn't see this reported, or any note on backwards
>>> incompatibility. Back trace:
>>>
>>> ===
>>>
>>> terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
>>>   what():  buffer::end_of_buffer
>>>
>>> Thread 9 "ms_dispatch" received signal SIGABRT, Aborted.
>>> [Switching to Thread 0x7fffd3fff700 (LWP 43060)]
>>> 0x00007fffee73c428 in __GI_raise (sig=sig@entry=6) at
>>> ../sysdeps/unix/sysv/linux/raise.c:54
>>> 54      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>>> (gdb) bt
>>> #0  0x00007fffee73c428 in __GI_raise (sig=sig@entry=6) at
>>> ../sysdeps/unix/sysv/linux/raise.c:54
>>> #1  0x00007fffee73e02a in __GI_abort () at abort.c:89
>>> #2  0x00007fffeed7684d in __gnu_cxx::__verbose_terminate_handler() ()
>>> from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>> #3  0x00007fffeed746b6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>> #4  0x00007fffeed74701 in std::terminate() () from
>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>> #5  0x00007fffeed74919 in __cxa_throw () from
>>> /usr/lib/x86_64-linux-gnu/libstdc++.so.6
>>> #6  0x00007fffef49cbb2 in ceph::buffer::ptr::iterator::get_pos_add
>>> (this=<optimized out>, this=<optimized out>, n=4) at
>>> /home/nwatkins/ceph/src/include/buffer.h:196
>>> #7  0x00007fffef4b5509 in ceph::buffer::ptr::iterator::get_pos_add
>>> (this=<synthetic pointer>, this=<synthetic pointer>, n=4) at
>>> /usr/include/c++/5/bits/stl_vector.h:676
>>> #8  denc_traits<unsigned int, void>::decode (p=<synthetic pointer>,
>>> o=<optimized out>) at /home/nwatkins/ceph/src/include/denc.h:224
>>> #9  denc<unsigned int, denc_traits<unsigned int, void> > (features=0,
>>> p=<synthetic pointer>, o=<optimized out>) at
>>> /home/nwatkins/ceph/src/include/denc.h:496
>>> #10 denc_traits<std::vector<unsigned int, std::allocator<unsigned int> >,
>>> void>::decode (p=<synthetic pointer>, s=std::vector of length
>>> 16777216, capacity 16777216 = {...})
>>>     at /home/nwatkins/ceph/src/include/denc.h:796
>>> #11 decode<std::vector<unsigned int, std::allocator<unsigned int> >,
>>> denc_traits<std::vector<unsigned int, std::allocator<unsigned int> >,
>>> void> > (
>>>     o=std::vector of length 16777216, capacity 16777216 = {...},
>>> p=...) at /home/nwatkins/ceph/src/include/denc.h:1157
>>> #12 0x00007fffef4aee0c in OSDMap::decode (this=this@entry=0x817f20,
>>> bl=...) at /home/nwatkins/ceph/src/osd/OSDMap.cc:2144
>>> #13 0x00007fffef4b0b2e in OSDMap::decode (this=0x817f20, bl=...) at
>>> /home/nwatkins/ceph/src/osd/OSDMap.cc:1991
>>> #14 0x00007fffef387d2b in Objecter::handle_osd_map
>>> (this=this@entry=0x8178a0, m=m@entry=0x7fffd80012e0) at
>>> /home/nwatkins/ceph/src/osdc/Objecter.cc:1242
>>> #15 0x00007fffef388a87 in Objecter::ms_dispatch (this=0x8178a0,
>>> m=0x7fffd80012e0) at /home/nwatkins/ceph/src/osdc/Objecter.cc:1005
>>> #16 0x00007fffef5fd8ca in Messenger::ms_deliver_dispatch
>>> (m=0x7fffd80012e0, this=0x789fb0) at
>>> /home/nwatkins/ceph/src/msg/Messenger.h:593
>>> #17 DispatchQueue::entry (this=0x78a128) at
>>> /home/nwatkins/ceph/src/msg/DispatchQueue.cc:197
>>> #18 0x00007fffef47968d in DispatchQueue::DispatchThread::entry
>>> (this=<optimized out>) at
>>> /home/nwatkins/ceph/src/msg/DispatchQueue.h:103
>>> #19 0x00007fffef0706ba in start_thread (arg=0x7fffd3fff700) at
>>> pthread_create.c:333
>>> #20 0x00007fffee80e3dd in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html