Hi Igor,

On 12.09.19 at 19:34, Igor Fedotov wrote:
> Hi Stefan,
>
> thanks for the update.
>
> Relevant PR from Paul mentions kernels (4.9+):
> https://github.com/ceph/ceph/pull/23273
>
> Not sure how correct this is. That's all I have..
>
> Try asking Sage/Paul...
>
> Also could you please update the ticket with more details, e.g. what are
> the original and new kernel versions

This was just by accident - after some days of running it happens again.

What I'm wondering is why Ceph has had a workaround / patch since June 2018,
but Luminous is still unpatched and 12.2.13 might be the first release
containing this fix - with no idea of an ETA.
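In case it helps anyone else, this is roughly the check I run per node to see
on which side of the 4.9 mentioned in Paul's PR a kernel sits - only a quick,
untested sketch, and the 4.9 threshold is just taken from the PR discussion,
not a confirmed cut-off for this bug:

#!/usr/bin/env python3
# Rough sketch: report whether the running kernel is below or at/above the
# 4.9 mentioned in https://github.com/ceph/ceph/pull/23273. The threshold
# is an assumption taken from that PR discussion, not a confirmed cut-off.
import platform

REFERENCE = (4, 9)

def running_kernel():
    # platform.release() returns the kernel release, e.g. "4.19.0-6-amd64"
    release = platform.release()
    major, minor = release.split("-")[0].split(".")[:2]
    return (int(major), int(minor)), release

version, release = running_kernel()
side = "older than" if version < REFERENCE else "at or above"
print("running kernel %s is %s %d.%d"
      % (release, side, REFERENCE[0], REFERENCE[1]))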
Greets,
Stefan

> Thanks,
>
> Igor
>
> On 9/12/2019 8:20 PM, Stefan Priebe - Profihost AG wrote:
>> Hello Igor,
>>
>> I can now confirm that this is indeed a kernel bug. The issue no
>> longer happens on upgraded nodes.
>>
>> Do you know more about it? I really would like to know in which version
>> it was fixed, to avoid rebooting all Ceph nodes.
>>
>> Greets,
>> Stefan
>>
>> On 27.08.19 at 16:20, Igor Fedotov wrote:
>>> It sounds like the OSD is "recovering" after the checksum error.
>>>
>>> I.e. the just-failed OSD shows no errors in fsck and is able to restart
>>> and process new write requests for a long enough period (longer than
>>> just a couple of minutes). Are these statements true? If so, I suppose
>>> this is an accidental/volatile issue rather than data-at-rest
>>> corruption - something like data incorrectly read from disk.
>>>
>>> Are you using a standalone disk drive for DB/WAL, or is it shared with
>>> the main one? Just in case, as a low-hanging fruit I'd suggest checking
>>> dmesg and smartctl for drive errors...
>>>
>>> FYI: one more reference for a similar issue:
>>> https://tracker.ceph.com/issues/24968
>>>
>>> A HW issue that time...
>>>
>>> Also I recall an issue with some kernels that caused occasional invalid
>>> data reads under high memory pressure/swapping:
>>> https://tracker.ceph.com/issues/22464
>>>
>>> IMO memory usage is worth checking as well...
>>>
>>> Igor
>>>
>>> On 8/27/2019 4:52 PM, Stefan Priebe - Profihost AG wrote:
>>>> see inline
>>>>
>>>> On 27.08.19 at 15:43, Igor Fedotov wrote:
>>>>> see inline
>>>>>
>>>>> On 8/27/2019 4:41 PM, Stefan Priebe - Profihost AG wrote:
>>>>>> Hi Igor,
>>>>>>
>>>>>> On 27.08.19 at 14:11, Igor Fedotov wrote:
>>>>>>> Hi Stefan,
>>>>>>>
>>>>>>> this looks like a duplicate of
>>>>>>>
>>>>>>> https://tracker.ceph.com/issues/37282
>>>>>>>
>>>>>>> Actually the root cause selection might be quite wide - from HW
>>>>>>> issues to broken logic in RocksDB/BlueStore/BlueFS etc.
>>>>>>>
>>>>>>> As far as I understand you have different OSDs which are failing,
>>>>>>> right?
>>>>>> Yes, I've seen this on around 50 different OSDs running different
>>>>>> HW, but all run Ceph 12.2.12. I've not seen this with 12.2.10,
>>>>>> which we were running before.
>>>>>>
>>>>>>> Is the set of these broken OSDs limited somehow?
>>>>>> No, at least I'm not able to find any such limit.
>>>>>>
>>>>>>> Any specific subset which is failing or something? E.g. just N of
>>>>>>> them are failing from time to time.
>>>>>> No, it seems totally random.
>>>>>>
>>>>>>> Any similarities for broken OSDs (e.g. specific hardware)?
>>>>>> All run Intel Xeon CPUs and all run Linux ;-)
>>>>>>
>>>>>>> Did you run fsck for any of the broken OSDs? Any reports?
>>>>>> Yes, but no reports.
>>>>> Are you saying that fsck is fine for OSDs that showed this sort of
>>>>> error?
>>>> Yes, fsck does not show a single error - everything is fine.
>>>>
>>>>>>> Any other errors/crashes in the logs before this sort of issue
>>>>>>> happens?
>>>>>> No
>>>>>>
>>>>>>> Just in case - what allocator are you using?
>>>>>> tcmalloc
>>>>> I meant the BlueStore allocator - is it stupid or bitmap?
>>>> Ah, the default one - I think that is stupid.
>>>>
>>>> Greets,
>>>> Stefan
>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Igor
>>>>>>>
>>>>>>> On 8/27/2019 1:03 PM, Stefan Priebe - Profihost AG wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> since some months all our BlueStore OSDs keep crashing from time
>>>>>>>> to time, currently about 5 OSDs per day.
>>>>>>>>
>>>>>>>> All of them show the following trace:
>>>>>>>> 2019-07-24 08:36:48.995397 7fb19a711700 -1 rocksdb: submit_transaction
>>>>>>>> error: Corruption: block checksum mismatch code = 2 Rocksdb transaction:
>>>>>>>> Put( Prefix = M key = 0x00000000000009a5'.0000916366.00000000000074680351' Value size = 184)
>>>>>>>> Put( Prefix = M key = 0x00000000000009a5'._fastinfo' Value size = 186)
>>>>>>>> Put( Prefix = O key = 0x7f8000000000000003bb605f'd!rbd_data.afe49a6b8b4567.0000000000003c11!='0xfffffffffffffffeffffffffffffffff6f00240000'x' Value size = 530)
>>>>>>>> Put( Prefix = O key = 0x7f8000000000000003bb605f'd!rbd_data.afe49a6b8b4567.0000000000003c11!='0xfffffffffffffffeffffffffffffffff'o' Value size = 510)
>>>>>>>> Put( Prefix = L key = 0x0000000010ba60f1 Value size = 4135)
>>>>>>>> 2019-07-24 08:36:49.012110 7fb19a711700 -1
>>>>>>>> /build/ceph/src/os/bluestore/BlueStore.cc: In function 'void
>>>>>>>> BlueStore::_kv_sync_thread()' thread 7fb19a711700 time 2019-07-24 08:36:48.995415
>>>>>>>> /build/ceph/src/os/bluestore/BlueStore.cc: 8808: FAILED assert(r == 0)
>>>>>>>>
>>>>>>>> ceph version 12.2.12-7-g1321c5e91f
>>>>>>>> (1321c5e91f3d5d35dd5aa5a0029a54b9a8ab9498) luminous (stable)
>>>>>>>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
>>>>>>>> const*)+0x102) [0x5653a010e222]
>>>>>>>> 2: (BlueStore::_kv_sync_thread()+0x24c5) [0x56539ff964b5]
>>>>>>>> 3: (BlueStore::KVSyncThread::entry()+0xd) [0x56539ffd708d]
>>>>>>>> 4: (()+0x7494) [0x7fb1ab2f6494]
>>>>>>>> 5: (clone()+0x3f) [0x7fb1aa37dacf]
>>>>>>>>
>>>>>>>> I already opened a tracker ticket:
>>>>>>>> https://tracker.ceph.com/issues/41367
>>>>>>>>
>>>>>>>> Can anybody help? Is this known?
>>>>>>>>
>>>>>>>> Greets,
>>>>>>>> Stefan
>>>>>>>> _______________________________________________
>>>>>>>> ceph-users mailing list
>>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
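A minimal, untested sketch of the per-node checks discussed in the thread
above (offline BlueStore fsck plus a quick SMART / kernel-log review). The
OSD data path and device below are placeholders, and the OSD has to be
stopped before fsck is pointed at its data directory:

#!/usr/bin/env python3
# Untested sketch: run the checks suggested in the thread on one node.
# OSD_PATH and DEVICE are placeholders - adjust them per host; the OSD
# must be stopped before ceph-bluestore-tool fsck touches its data dir.
import subprocess

OSD_PATH = "/var/lib/ceph/osd/ceph-0"   # placeholder OSD data directory
DEVICE = "/dev/sda"                     # placeholder DB/WAL or main device

CHECKS = [
    ["ceph-bluestore-tool", "fsck", "--path", OSD_PATH],
    ["smartctl", "-H", "-A", DEVICE],
    ["dmesg", "--level=err,crit"],
]

for cmd in CHECKS:
    print("### " + " ".join(cmd))
    # smartctl uses non-zero exit codes for warnings, so report the code
    # instead of raising.
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    print(result.stdout)
    if result.returncode != 0:
        print("exit code %d" % result.returncode)
        print(result.stderr)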