Looks like the valgrind errors are being caused primarily by this, or something very similar. https://github.com/gperftools/gperftools/issues/792 On Sat, Dec 24, 2016 at 8:00 AM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote: > On Sat, Dec 24, 2016 at 3:06 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote: >> On Fri, 23 Dec 2016, Sage Weil wrote: >>> On Fri, 23 Dec 2016, Sage Weil wrote: >>> > On Fri, 23 Dec 2016, Willem Jan Withagen wrote: >>> > > On 22-12-2016 23:01, Allen Samuels wrote: >>> > > > I believe I mis-read the data. What I've seen before doesn't fit this data. >>> > > > >>> > > > If it fails in unit test, it shouldn't be hard to just set a HW breakpoint on the vptr and see who the culprit is. >>> > > >>> > > Well the _raw class is properly initialised: >>> > > $8 = {_vptr$raw = 0x817578 <vtable for ceph::buffer::raw+16>, >>> > > >>> > > And is "upgraded" to a different class: >>> > > ceph::buffer::raw_combined >>> > > >>> > > But here is the problem... >>> > > >>> > > Hardware watchpoint 5: *0x1054048 >>> > > >>> > > Old value = 8484744 >>> > > New value = 0 >>> > > 0x000000000049ab29 in denc_traits<unsigned long, void>::encode >>> > > (o=@0x10501d0: 123, p=..., f=0) at >>> > > /usr/srcs/Ceph/work/ceph/src/include/denc.h:228 >>> > > 228 WRITE_INT_DENC(uint64_t, __le64); >>> > >>> > Ah, okay, this is making a bit more sense now. >>> > >>> > raw_combined is trying to be clever. It has a static create() method to >>> > >>> > - allocate an aligned buffer of size len + sizeof(raw_combined) (with some >>> > rounding up), call it *ptr >>> > - in-place initiailize raw_combined at ptr + len (rounded up), call it *this >>> > - store the the actual data at ptr >>> > - return the raw_combined *this >>> > >>> > It has an overloaded delete operator to free the entire allocate >>> > (roughly this - len) to make this work. >>> > >>> > It sounds a bit like the write into raw_combined's data portion (ptr) is >>> > clobbering vptr. Which probably suggests we wrote too much into the >>> > buffer. Can you print the contents of the raw_combined class and the >>> > appender? >>> >>> Hmm, I jumped to a much older commit and the valgrind warnigns go away; I >>> think it's related to the buffer mempools. Bisecting. >> >> Weird. The valgrind warnings are caused by >> e630d74c6457a7f1fe71fb4ed2f15a7ef9114fe8, which makes unittest_denc >> link against tcmalloc and libunwind. I'm not really sure why that would >> affect valgrind's delete vs delete[] detection... :/ It seems like that >> would be intercepted and checked above tcmalloc. > > From research I've been doing it seems this may be a false positive > caused by inlining. Passing "--read-inline-info=yes" to valgrind shows > some inlining in the stacks and the theory is this confuses valgrind. > > I think the following may be the core of the problem. > > ==12443== Mismatched free() / delete / delete [] > ==12443== at 0x4C2ED4A: free (vg_replace_malloc.c:530) > ==12443== by 0x590A474: ??? (in /usr/lib64/libtcmalloc.so.4.3.0) > ==12443== by 0x40109C9: call_init.part.0 (dl-init.c:72) > ==12443== by 0x4010ADA: call_init (dl-init.c:30) > ==12443== by 0x4010ADA: _dl_init (dl-init.c:120) > ==12443== by 0x4000D09: ??? (in /usr/lib64/ld-2.24.so) > ==12443== Address 0x8aadd20 is 0 bytes inside a block of size 4 alloc'd > ==12443== at 0x4C2E8E9: operator new[](unsigned long) > (vg_replace_malloc.c:423) > ==12443== by 0x590A467: ??? (in /usr/lib64/libtcmalloc.so.4.3.0) > ==12443== by 0x40109C9: call_init.part.0 (dl-init.c:72) > ==12443== by 0x4010ADA: call_init (dl-init.c:30) > ==12443== by 0x4010ADA: _dl_init (dl-init.c:120) > ==12443== by 0x4000D09: ??? (in /usr/lib64/ld-2.24.so) > > Having said that it's not clear atm why these aren't showing up all > over the place. > >> >> sage >> >>> >>> >>> >>> >>> > >>> > > >>> > > The traceback: >>> > > #0 0x000000000049ab29 in denc_traits<unsigned long, void>::encode >>> > > (o=@0x10501d0: 123, p=..., f=0) at >>> > > /usr/srcs/Ceph/work/ceph/src/include/denc.h:228 >>> > > #1 0x000000000049a7a8 in denc<unsigned long, denc_traits<unsigned long, >>> > > void> > (o=@0x10501d0: 123, p=..., features=0) at >>> > > /usr/srcs/Ceph/work/ceph/src/include/denc.h:479 >>> > > #2 0x000000000049a6a5 in _denc_friend<foo_t const, >>> > > ceph::buffer::list::contiguous_appender> (v=..., p=...) at >>> > > /usr/srcs/Ceph/work/ceph/src/test/test_denc.cc:155 >>> > > #3 0x000000000049a62d in foo_t::encode (this=0x10501c8, p=...) at >>> > > /usr/srcs/Ceph/work/ceph/src/test/test_denc.cc:152 >>> > > #4 0x000000000049a601 in denc_traits<foo_t, void>::encode (v=..., >>> > > p=..., f=0) at /usr/srcs/Ceph/work/ceph/src/test/test_denc.cc:163 >>> > > #5 0x000000000049a188 in denc<foo_t, denc_traits<foo_t, void> > (o=..., >>> > > p=..., features=0) at /usr/srcs/Ceph/work/ceph/src/include/denc.h:479 >>> > > #6 0x000000000049a0ac in >>> > > denc_traits<std::__1::map<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> >, foo_t, >>> > > std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, >>> > > std::__1::allocator<char> > >, >>> > > std::__1::allocator<std::__1::pair<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> > const, foo_t> > >>> > > >, void>::encode<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> > > (v=..., p=...) >>> > > at /usr/srcs/Ceph/work/ceph/src/include/denc.h:1010 >>> > > #7 0x0000000000494581 in >>> > > denc<std::__1::map<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> >, foo_t, >>> > > std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, >>> > > std::__1::allocator<char> > >, >>> > > std::__1::allocator<std::__1::pair<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> > const, foo_t> > >>> > > >, denc_traits<std::__1::map<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> >, foo_t, >>> > > std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, >>> > > std::__1::allocator<char> > >, >>> > > std::__1::allocator<std::__1::pair<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> > const, foo_t> > >>> > > >, void> > (o=..., p=..., features=0) at >>> > > /usr/srcs/Ceph/work/ceph/src/include/denc.h:479 >>> > > #8 0x0000000000490fb1 in >>> > > test_denc<std::__1::map<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> >, foo_t, >>> > > std::__1::less<std::__1::basic_string<char, std::__1::char_traits<char>, >>> > > std::__1::allocator<char> > >, >>> > > std::__1::allocator<std::__1::pair<std::__1::basic_string<char, >>> > > std::__1::char_traits<char>, std::__1::allocator<char> > const, foo_t> > >>> > > > > (v=...) at /usr/srcs/Ceph/work/ceph/src/test/test_denc.cc:49 >>> > > #9 0x00000000004904e0 in denc_map_Test::TestBody (this=0x1038030) at >>> > > /usr/srcs/Ceph/work/ceph/src/test/test_denc.cc:206 >>> > > >>> > > (gdb) p p >>> > > $10 = (ceph::buffer::list::contiguous_appender &) @0x7fffffffd948: {pbl >>> > > = 0x7fffffffd978, pos = 0x105404f "", bp = {_raw = 0x1054048, _off = 0, >>> > > _len = 72}, deep = false, >>> > > out_of_band_offset = 0} >>> > > (gdb) p *p.bp._raw >>> > > $17 = {_vptr$raw = 0x0, data = 0x1054000 "\003", len = 72, nref = {val = >>> > > 1}, crc_spinlock = 0, crc_map = {__tree_ = {__begin_node_ = 0x1054078, >>> > > __pair1_ = >>> > > {<std::__1::__libcpp_compressed_pair_imp<std::__1::__tree_end_node<std::__1::__tree_node_base<void*>*>, >>> > > std::__1::allocator<std::__1::__tree_node<std::__1::__value_type<std::__1::pair<unsigned >>> > > long, unsigned long>, std::__1::pair<unsigned int, unsigned int> >, >>> > > void*> >, 2>> = >>> > > {<std::__1::allocator<std::__1::__tree_node<std::__1::__value_type<std::__1::pair<unsigned >>> > > long, unsigned long>, std::__1::pair<unsigned int, unsigned int> >, >>> > > void*> >> = {<No data fields>}, __first_ = { >>> > > __left_ = 0x0}}, <No data fields>}, >>> > > __pair3_ = {<std::__1::__libcpp_compressed_pair_imp<unsigned long, >>> > > std::__1::__map_value_compare<std::__1::pair<unsigned long, unsigned >>> > > long>, std::__1::__value_type<std::__1::pair<unsigned long, unsigned >>> > > long>, std::__1::pair<unsigned int, unsigned int> >, >>> > > std::__1::less<std::__1::pair<unsigned long, unsigned long> >, true>, >>> > > 2>> = {<std::__1::__map_value_compare<std::__1::pair<unsigned long, >>> > > unsigned long>, std::__1::__value_type<std::__1::pair<unsigned long, >>> > > unsigned long>, std::__1::pair<unsigned int, unsigned int> >, >>> > > std::__1::less<std::__1::pair<unsigned long, unsigned long> >, true>> = >>> > > {<std::__1::less<std::__1::pair<unsigned long, unsigned long> >> = >>> > > {<std::__1::binary_function<std::__1::pair<unsigned long, unsigned >>> > > long>, std::__1::pair<unsigned long, unsigned long>, bool>> = {<No data >>> > > fields>}, <No data fields>}, <No data fields>}, >>> > > __first_ = 0}, <No data fields>}}}} >>> > > >>> > > (gdb) disassemble >>> > > Dump of assembler code for function denc_traits<unsigned long, >>> > > void>::encode(unsigned long const&, >>> > > ceph::buffer::list::contiguous_appender&, unsigned long): >>> > > 0x000000000049aaf0 <+0>: push %rbp >>> > > 0x000000000049aaf1 <+1>: mov %rsp,%rbp >>> > > 0x000000000049aaf4 <+4>: sub $0x20,%rsp >>> > > 0x000000000049aaf8 <+8>: mov $0x8,%eax >>> > > 0x000000000049aafd <+13>: mov %eax,%ecx >>> > > 0x000000000049aaff <+15>: mov %rdi,-0x8(%rbp) >>> > > 0x000000000049ab03 <+19>: mov %rsi,-0x10(%rbp) >>> > > 0x000000000049ab07 <+23>: mov %rdx,-0x18(%rbp) >>> > > 0x000000000049ab0b <+27>: mov -0x8(%rbp),%rdx >>> > > 0x000000000049ab0f <+31>: mov (%rdx),%rdx >>> > > 0x000000000049ab12 <+34>: mov -0x10(%rbp),%rdi >>> > > 0x000000000049ab16 <+38>: mov %rcx,%rsi >>> > > 0x000000000049ab19 <+41>: mov %rdx,-0x20(%rbp) >>> > > 0x000000000049ab1d <+45>: callq 0x49a1d0 >>> > > <ceph::buffer::list::contiguous_appender::get_pos_add(unsigned long)> >>> > > 0x000000000049ab22 <+50>: mov -0x20(%rbp),%rcx >>> > > 0x000000000049ab26 <+54>: mov %rcx,(%rax) >>> > > => 0x000000000049ab29 <+57>: add $0x20,%rsp >>> > > 0x000000000049ab2d <+61>: pop %rbp >>> > > 0x000000000049ab2e <+62>: retq >>> > > End of assembler dump. >>> > > >>> > > (gdb) info registers >>> > > rax 0x1054047 17121351 >>> > > rbx 0x0 0 >>> > > rcx 0x7b 123 >>> > > rdx 0x7b 123 >>> > > rsi 0x7fffffffd948 140737488345416 >>> > > rdi 0x105404f 17121359 >>> > > rbp 0x7fffffffd4f0 0x7fffffffd4f0 >>> > > rsp 0x7fffffffd4d0 0x7fffffffd4d0 >>> > > r8 0x0 0 >>> > > r9 0x0 0 >>> > > r10 0x0 0 >>> > > r11 0x829100 8556800 >>> > > r12 0x7fffffffe910 140737488349456 >>> > > r13 0x7fffffffe928 140737488349480 >>> > > r14 0x7fffffffe918 140737488349464 >>> > > r15 0x1 1 >>> > > rip 0x49ab29 0x49ab29 <denc_traits<unsigned long, >>> > > void>::encode(unsigned long const&, >>> > > ceph::buffer::list::contiguous_appender&, unsigned long)+57> >>> > > >>> > > And this is wierd: >>> > > %rax contains 0x1054047 >>> > > _raw is 0x1054048 >>> > > >>> > > So writing %rcx to (%rax) is definitly going to overwrite the larger >>> > > part of _raw. >>> > > >>> > > So now the question transforms into: >>> > > Why does %rax hold this wierd value ?? >>> > > or Why does get_pos_add(sizeof(etype)) a negative offset ?? >>> > > >>> > > Got to go and do real dayjob work. >>> > > >>> > > --WjW >>> > > -- >>> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> > > the body of a message to majordomo@xxxxxxxxxxxxxxx >>> > > More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > > >>> > > >>> > -- >>> > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> > the body of a message to majordomo@xxxxxxxxxxxxxxx >>> > More majordomo info at http://vger.kernel.org/majordomo-info.html >>> > >>> > >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@xxxxxxxxxxxxxxx >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- > Cheers, > Brad -- Cheers, Brad -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html