Re: Compiling for FreeBSD, trouble in buffer.c

On 10-12-2015 16:03, Willem Jan Withagen wrote:
I have a failure in:
  ./unittest_erasure_code_shec_arguments
All tests before this PASS (other than rbd, which is disabled for
the time being).

I traced this back to code in ErasureCodeShec.cc,
line 218:
     unsigned blocksize = (*chunks.begin()).second.length();
After a few iterations I get a "negative" blocksize (a huge value, since the
type is unsigned), which causes allocations further on to thrash the system
until it runs out of swap.

At first I suspected a Clang type-casting problem.
But after more debugging I found the following in
buffer.h:
     unsigned length() const {
#if 0
       // DEBUG: verify _len
       unsigned len = 0;
       for (std::list<ptr>::const_iterator it = _buffers.begin();
            it != _buffers.end();
            it++) {
         len += (*it).length();
       }
       assert(len == _len);
#endif
       return _len;
     }
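
To make clear what that disabled block checks, here is a minimal sketch of the
same invariant on a toy type. Everything in it (toy_bufferlist, append,
recomputed_length, length_is_consistent) is a hypothetical stand-in and not
Ceph's actual bufferlist; it only assumes a list of segments plus a cached
total that must always agree with the sum of the segment lengths.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for a buffer list; NOT Ceph's actual bufferlist.
struct toy_bufferlist {
  std::list<std::string> buffers;  // each string plays the role of one ptr
  unsigned len = 0;                // cached total, playing the role of _len

  // Add a segment and keep the cached total in step with it.
  void append(const std::string& s) {
    buffers.push_back(s);
    len += static_cast<unsigned>(s.size());
  }

  // Recompute the total from the segments, as the "#if 0" block does.
  unsigned recomputed_length() const {
    unsigned total = 0;
    for (const auto& b : buffers) {
      total += static_cast<unsigned>(b.size());
    }
    return total;
  }

  // True while the cached value still matches the recomputed one.
  bool length_is_consistent() const {
    return recomputed_length() == len;
  }
};
```

If something outside the class overwrites the cached field, the check fails
even though the segments themselves are untouched, which would match the
assert firing here.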

This suggests that this spot already needed debugging earlier in its life.
If I enable this debug block, the assert does trigger.

Now the next question is: why? Given the debug snippet, this has needed
analyzing before.
And the derived question then is:
     What is the easiest path to find out what is actually wrong here?


A further followup on this.

After some extensive debugging with gdb and watchpoints, I've come to the conclusion
that the location of _len is used by more than one part of the code...
The location gets written alternately during:
TestErasureCodeShec_arguments.cc:136
    shec_table.insert(std::make_pair(table_key,table_value));

Old value = 63015016
New value = 4294954344
....
Old value = 4294954344
New value = 63015016
.....

In the end the location retains the value 4294954344, which is definitely not the length:
printing the value on the Linux variant gives 32, which sounds much more
sensible....
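
As a side note on why this shows up as a "negative" blocksize: read as a
signed 32-bit number, 4294954344 (0xFFFFCD68) is -12952. A small sketch of
that arithmetic follows; as_signed32 is a helper name made up for this
illustration and is not anything from the Ceph tree.

```cpp
#include <cstdint>

// Hypothetical helper: interpret a 32-bit unsigned value as a signed
// two's-complement number. The arithmetic is done explicitly in 64 bits
// so it does not depend on implementation-defined narrowing conversions.
std::int64_t as_signed32(std::uint32_t v) {
  const std::int64_t wide = v;
  // If the top bit is set, the signed reading is the value minus 2^32.
  return (v & 0x80000000u) ? wide - (std::int64_t{1} << 32) : wide;
}
```

With this, as_signed32(4294954344u) gives -12952, while a sane length such as
32 is returned unchanged.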

So there are a few possibilities that I can think of:
 1) Clang gets it wrong
 2) There is a mix-up of different types of libs that makes for different offsets in
    the bufferlist structs
 3) the bufferlist code has portability issues
 4) the bufferlist code has errors that do not show with gcc

Most likely it will be either 2) or 3) ....
But other suggestions are welcome...
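
One way possibility 2) could be checked, as a sketch only: print the size of
the struct and the offset of its length field from each library or object
that is built separately, and compare the numbers. The struct below
(toy_list) is a hypothetical standard-layout stand-in, since offsetof is only
guaranteed for such types; the real bufferlist would need a similar probe
compiled with the same flags as each component.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical struct for illustration only; NOT Ceph's bufferlist.
struct toy_list {
  void* head;
  void* tail;
  unsigned len;  // plays the role of _len
};

// Size of the struct and offset of its length field in this build.
struct layout_info {
  std::size_t size;
  std::size_t len_offset;
};

layout_info describe_toy_list() {
  return { sizeof(toy_list), offsetof(toy_list, len) };
}

// Print the layout so the output of differently built components can be
// compared side by side.
void print_layout() {
  const layout_info li = describe_toy_list();
  std::printf("sizeof=%zu offsetof(len)=%zu\n", li.size, li.len_offset);
}
```

If two components report different numbers for the same type, they disagree
about where the length lives, and writes from one would land on another
field as seen by the other.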

And since bufferlists are at the center of Ceph, we had better get things right.
So I'm going to go over the test/bufferlist.cc code and see what is in there, and/or
extract a less convoluted example from TestErasureCodeShec_arguments.cc
and see if the problem shows up there as well.

--WjW





	