Re: Hard to debug problem with ceph_erasure_code

On 1-4-2016 11:34, Willem Jan Withagen wrote:
> On 1-4-2016 07:12, Mykola Golub wrote:
>> On Thu, Mar 31, 2016 at 07:10:45PM +0200, Willem Jan Withagen wrote:
>>
>>> Does anybody have suggestions as to how to track/debug this?
>>
>> valgrind?
>>
> 
> Yup, tried that one, but it is sort of hard to find an intermittent
> erroneous write. I tried --track-addr=<addr of m_exp_len>, but most of
> the time it is only written at exactly the code line where it is
> supposed to be written. So no info there.
> 
> So perhaps I need a different set of tests?
> 
> On average I need about 600 runs to catch one SIGSEGV.
> 
> BTW: tried it on 2 FreeBSD systems, and on both the behaviour is
> identical. So it has got to be the code. And since 65000 runs on Linux
> give no errors, it is also specific to the combo
> FreeBSD/Clang/FreeBSD-packages.

And it gets even weirder.
Clang allows the use of AddressSanitizer
( https://github.com/google/sanitizers/wiki/AddressSanitizer )

So I've compiled all of ceph with
    -fsanitize=address -fno-omit-frame-pointer
And now I've already logged 50,000 runs without crashes.

Feels a lot like Schrödinger's cat....
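
For anyone who wants to reproduce this kind of count, here is a minimal
sketch in Python of a loop that runs a command many times and counts how
many runs die with SIGSEGV. It is not the script used for the numbers
above; the function name count_crashes and the command line are
hypothetical, and it assumes a POSIX system where a child killed by a
signal reports a negative return code.

```python
import signal
import subprocess
import sys


def count_crashes(cmd, runs):
    """Run cmd `runs` times and return how many runs were killed by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX a negative return code is the number of the signal
        # that terminated the child process.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


# Hypothetical usage: python count_crashes.py RUNS COMMAND [ARGS...]
if __name__ == "__main__" and len(sys.argv) > 2:
    runs = int(sys.argv[1])
    print(count_crashes(sys.argv[2:], runs), "SIGSEGV in", runs, "runs")
```

Running the same loop against the plain build and the AddressSanitizer
build gives directly comparable crash counts.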

--WjW

